Re: CHIPPro NAND issue with 4.12 rc1

2017-05-20 Thread Boris Brezillon
On Sat, 20 May 2017 15:24:06 -0600,
Angus Ainslie wrote:

> On 2017-05-20 09:14, Boris Brezillon wrote:
> > On Sat, 20 May 2017 08:49:04 -0600,
> > Angus Ainslie wrote:
> >   
> >> Hi All,
> >> 
> >> I'm trying to boot a CHIPPro with the stock 4.12 rc1 kernel. If I make
> >> no modifications to the sun5i-gr8-chip-pro.dtb the kernel boots but
> >> can't find the root partition.
> >> 
> >> So I added the partitions to the dts file
> >> 
> >> diff --git a/arch/arm/boot/dts/sun5i-gr8-chip-pro.dts
> >> b/arch/arm/boot/dts/sun5i-gr8-chip-pro.dts
> >> index c55b11a..0e61e6b 100644
> >> --- a/arch/arm/boot/dts/sun5i-gr8-chip-pro.dts
> >> +++ b/arch/arm/boot/dts/sun5i-gr8-chip-pro.dts
> >> @@ -146,6 +146,32 @@
> >>  reg = <0>;
> >>  allwinner,rb = <0>;
> >>  nand-ecc-mode = "hw";
> >> +nand-on-flash-bbt;
> >> +
> >> +spl@0 {
> >> +  label = "SPL";
> >> +  reg = /bits/ 64 <0x0 0x400000>;
> >> +};
> >> +
> >> +spl-backup@400000 {
> >> +  label = "SPL.backup";
> >> +  reg = /bits/ 64 <0x400000 0x400000>;
> >> +};
> >> +
> >> +u-boot@800000 {
> >> +  label = "U-Boot";
> >> +  reg = /bits/ 64 <0x800000 0x400000>;
> >> +};
> >> +
> >> +env@c00000 {
> >> +  label = "env";
> >> +  reg = /bits/ 64 <0xc00000 0x400000>;
> >> +};
> >> +
> >> +rootfs@1000000 {
> >> +  label = "rootfs";
> >> +  reg = /bits/ 64 <0x1000000 0x1f000000>;
> >> +};
> >>  };
> >>   };
> >> 
> >> and now the kernel finds the partition but it times out trying to mount
> >> it. It seems to be something in the dts files because if I use the
> >> ntc-gr8-crumb.dts from the ntc 4.4.30 kernel then the system boots all
> >> the way to userland.
> > 
> > Hm, that's weird. Just changing the dtb makes it work? Did you try to
> > dump both dtbs and figure out what else changes?
> >   
> 
> Yeah I thought it was weird too. I was thinking that maybe the pin muxes 
> were getting changed and the rb net or the interrupt net was getting 
> changed to a different function.
> 
> I did decompile the 2 dtbs and I couldn't find many differences. They 
> were mostly around some pull ups and drive strength for some of the NAND 
> and i2c pins. I tried adding those changes and it still didn't work so I 
> went back to the minimal set of changes to reproduce the bug.
> 
> > Also, I wonder how the NAND is correctly detected without this patch
> > [1].
> >   
> 
> 
> That patch seems to be in my 4.12-rc1 kernel, I have a definition for 
> the TC58NVG2S0H.
> 
> >> 
> >> [7.13] ubi0: scanning is finished
> >> [7.15] ubi0: attached mtd4 (name "rootfs", size 496 MiB)
> >> [7.16] ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 258048 bytes
> >> [7.17] ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size 1024
> >> [7.18] ubi0: VID header offset: 1024 (aligned 1024), data offset: 4096
> >> [7.19] ubi0: good PEBs: 1977, bad PEBs: 7, corrupted PEBs: 0
> >> [7.20] ubi0: user volume: 1, internal volumes: 1, max. volumes count: 128
> >> [7.21] ubi0: max/mean erase counter: 3/1, WL threshold: 4096, image sequence number: 177407
> >> [7.22] ubi0: available PEBs: 1, total reserved PEBs: 1976, PEBs reserved for bad PEB handling: 33
> > 
> > UBI attach works...
> >   
> >> [7.24] hctosys: unable to open rtc device (rtc0)
> >> [7.25] vcc3v0: disabling

Interestingly, it starts failing after the core disables all unused
regulators. I'm not sure this is related, but it's worth a look.

I looked at the schematics and it seems VCC-3V3 (which is powering the
NAND chip) is enabled with the EXTEN pin of the AXP209 [1]. I don't know
if this pin is controlled by Linux, but maybe you can dump register
0x12 and check if EXTEN is set to 1.
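
Once the register has been dumped, the check itself is a single bit test. The following is only a sketch: the position of EXTEN (bit 0 of register 0x12) is an assumption that should be confirmed against the AXP209 datasheet, and the macro and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: register 0x12 of the AXP209 is the power output control
 * register and EXTEN is bit 0. Verify both against the datasheet.
 */
#define AXP209_REG_PWR_OUT_CTRL	0x12
#define AXP209_EXTEN_BIT	(1u << 0)

/* Return true if EXTEN is set in a value read from register 0x12. */
static bool axp209_exten_enabled(uint8_t reg12)
{
	return (reg12 & AXP209_EXTEN_BIT) != 0;
}
```

If the bit reads back as 0 after the "disabling" message, that would be consistent with the NAND losing its supply at that point.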

> >> [7.25] ALSA device list:
> >> [7.26]   #0: sun4i-codec
> >> [7.26] ubi0: background thread "ubi_bgt0d" started, PID 53
> >> [8.32] sunxi_nand 1c03000.nand: wait interrupt timedout
> >> [9.32] sunxi_nand 1c03000.nand: wait interrupt timedout
> >> [   10.33] sunxi_nand 1c03000.nand: wait for empty cmd FIFO timedout
> >> [   11.34] sunxi_nand 1c03000.nand: wait for empty cmd FIFO timedout
> >> [   12.35] sunxi_nand 1c03000.nand: wait for empty cmd FIFO timedout
> >> [   13.36] sunxi_nand 1c03000.nand: wait for empty cmd FIFO timedout
> >> [   14.37] sunxi_nand 1c03000.nand: wait for empty cmd FIFO timedout
> >> [   14.38] ubi0 warning: ubi_io_read: error -110 while reading 4096 bytes from PEB 1034:4096, read only 0 bytes, retry
> > 
> > And suddenly you get timeouts. That's really weird.  
> 
> 
> Is there anything I can do on this end to help debug?
> 
> 
> > 
> > [1]https://github.com/NextThingCo/linux/commit/5ebc35ce1223ef14ace9479d5f97d0fce979e550
> >   
> 


Re: [PATCH 18/24] thunderbolt: Store Thunderbolt generation in the switch structure

2017-05-20 Thread Lukas Wunner
On Sun, May 21, 2017 at 05:29:47AM +0000, Levy, Amir (Jer) wrote:
> On Sun, May 21 2017, 07:47 AM, Lukas Wunner wrote:
> > On Thu, May 18, 2017 at 05:39:08PM +0300, Mika Westerberg wrote:
> > > +
> > > +	default:
> > > +		sw->generation = 1;
> > > +		break;
> > 
> > If someone adds an entry for, say, a new TB3 controller to nhi_ids[] but
> > forgets to update this function, the controller is assigned the wrong
> > generation number.  It might be better to make TB3 the default and list
> > each TB1 controller instead since it's less likely for Intel to introduce
> > an older gen chip.
> > 
> > Generally I think it's problematic to require that multiple files are
> > touched whenever a new controller is added.  Isn't the generation number
> > or link speed (10/20/40) stored in some register in PCI config space
> > (VSEC 0x1234) or TB config space?
> 
> How about setting information, that isn't available from PCI, in
> pci_device_id.driver_data when initializing nhi_ids[]?

Right, that would also be possible, though reading the generation number
from a register would be more elegant, if such a register exists.

Thanks,

Lukas
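
For reference, the driver_data idea could look roughly like the sketch below. It is a userspace model, not the driver's code: struct nhi_id stands in for struct pci_device_id, the helper name is hypothetical, and every device ID except 0x157e (quoted elsewhere in this thread) is a placeholder.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct pci_device_id: only the fields used here. */
struct nhi_id {
	uint16_t device;
	unsigned long driver_data;	/* here: Thunderbolt generation */
};

/* Placeholder IDs, except 0x157e which is quoted in this thread. */
static const struct nhi_id nhi_ids[] = {
	{ 0x0001, 1 },	/* hypothetical TB1 controller */
	{ 0x157e, 2 },	/* Win Ridge 2C bridge, assumed to be TB2 */
	{ 0x0003, 3 },	/* hypothetical TB3 controller */
};

/* Return the generation stored with the matching entry, or 0 if unknown. */
static unsigned int nhi_generation(uint16_t device)
{
	size_t i;

	for (i = 0; i < sizeof(nhi_ids) / sizeof(nhi_ids[0]); i++)
		if (nhi_ids[i].device == device)
			return (unsigned int)nhi_ids[i].driver_data;
	return 0;
}
```

With this layout a new controller is added in one place, and an ID that is missing from the table yields 0 rather than a silently wrong generation.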


Re: [PATCH 0/8] CaitSith LSM module

2017-05-20 Thread John Johansen
On 05/20/2017 09:59 PM, Tetsuo Handa wrote:
> John Johansen wrote:
>> On 11/22/2016 10:31 PM, Tetsuo Handa wrote:
>>> Tetsuo Handa wrote:
>>>> John Johansen wrote:
>>>>>> In order to minimize the burden of reviewing, this patchset implements
>>>>>> only functionality of checking program execution requests (i.e. execve()
>>>>>> system call) using pathnames. I'm planning to add other functionalities
>>>>>> after this version got included into mainline. You can find how future
>>>>>> versions of CaitSith will look like at http://caitsith.osdn.jp/ .
>>>>>>
>>>>> Thanks I've started working my way through this, but it is going to take
>>>>> me a while.
>>>>>
>>>>
>>>> Thank you for your time.
>>>
>>> May I hear the status? Is there something I can do other than waiting?
>>>
>> progressing very slowly, I have some time over the next few days as it's a
>> long weekend here in the US so hopefully I can finish this up
>>
> 
> May I hear the status again?
> 
Yes, sorry. I just haven't had time to look at it recently. I am sorry that
it has been so long. I am just going to have to book a day off and do it. I'll
see if I can get a day next week (it's getting late, but I can try) or the
week after.


> 
> 
> On 5th March 2017, a CTF game was held in an event titled
> "CyberColosseo x SecCon" ( http://2016.seccon.jp/news/#137 ). I gave a
> simple troubleshooting-like system-analyzing quiz using SSH shell session
> where operations are restricted by CaitSith.
> 
> Since the VM will be useful as an example of how to configure
> CaitSith's policy configuration, I made a downloadable version.
> 
>   
> http://osdn.jp/frs/redir.php?m=jaist=/caitsith/67303/SecCon20170305-CaitSith.zip
>   MD5: 99bad6936d8cdeb37d0d6af99265a2ac
> 
> This VM is configured for VMware Player 12 / 4 CPUs / 2048MB RAM.
> An IPv4 address will be assigned upon boot using DHCP service on the host 
> network.
> SSH username and password are both "caitsith".
> 



Re: [PATCH 10/24] thunderbolt: Read vendor and device name from DROM

2017-05-20 Thread Lukas Wunner
On Fri, May 19, 2017 at 01:28:36PM +0300, Mika Westerberg wrote:
> On Fri, May 19, 2017 at 12:07:10PM +0200, Lukas Wunner wrote:
> > Apple uses 0x30 to store a
> > serial number.  Is this attribute number assigned by Intel to Apple
> > or is it reserved for vendor use or did they arbitrarily choose it?
> 
> It is part of the DROM specification. The 0x30 - 0x3e are vendor
> specific entries.

Ah, so I have to qualify the vendor number with Apple's ID before I know
that it's a serial number.  Thanks.


> > If there can be many attributes, should they be stored in a list
> > rather than adding a char* pointer for each one to struct tb_switch?
> > The latter doesn't scale.
> 
> I don't think we need other attributes (well, at least right now). The
> device/vendor name is useful because that's what we expose to the
> userspace for device identification along with the device/vendor ID.

Okay.  It might be worth logging additional attributes at info level.


> > > +static void tb_drom_parse_generic_entry(struct tb_switch *sw,
> > > +		struct tb_drom_entry_generic *entry)
> > > +{
> > > +	if (entry->header.index == 1)
> > > +		sw->vendor_name = kstrdup((char *)entry->data, GFP_KERNEL);
> > > +	else if (entry->header.index == 2)
> > > +		sw->device_name = kstrdup((char *)entry->data, GFP_KERNEL);
> > > +}
> > 
> > This assumes that these are properly null-terminated strings, but the DROM
> > may contain complete garbage.  The existing drom parser is very careful
> > to validate and sanitize everything.
> 
> The DROM specification says they must be null-terminated but yes, it
> is possible that some of the devices have it wrong. The generic entry
> includes length field so I suppose we can use that + kmemdup() instead
> here?

Yes, as long as you check that the last character is null.

Thanks,

Lukas
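
A minimal sketch of the length-based copy with the terminator check, written as userspace C so it can be tried in isolation. malloc()/memcpy() stand in for kmemdup(), and the function name and parameters are hypothetical, not taken from the patch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy len bytes of a DROM generic entry, but only if the last byte is
 * a NUL terminator. Returns NULL for an empty or unterminated entry, or
 * if the allocation fails. The caller owns the returned buffer.
 */
static char *drom_dup_string(const uint8_t *data, size_t len)
{
	char *copy;

	if (len == 0 || data[len - 1] != '\0')
		return NULL;

	copy = malloc(len);
	if (!copy)
		return NULL;
	memcpy(copy, data, len);
	return copy;
}
```

Rejecting the entry outright is one option; forcing the last byte to NUL after copying would be another, at the cost of accepting truncated names.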


RE: [PATCH 18/24] thunderbolt: Store Thunderbolt generation in the switch structure

2017-05-20 Thread Levy, Amir (Jer)
On Sun, May 21 2017, 07:47 AM, Lukas Wunner wrote:
> On Thu, May 18, 2017 at 05:39:08PM +0300, Mika Westerberg wrote:
> > +
> > +   default:
> > +   sw->generation = 1;
> > +   break;
> 
> If someone adds an entry for, say, a new TB3 controller to nhi_ids[] but
> forgets to update this function, the controller is assigned the wrong
> generation number.  It might be better to make TB3 the default and list
> each TB1 controller instead since it's less likely for Intel to introduce
> an older gen chip.
> 
> Generally I think it's problematic to require that multiple files are touched
> whenever a new controller is added.  Isn't the generation number or link speed
> (10/20/40) stored in some register in PCI config space (VSEC 0x1234) or TB
> config space?
> 
> Thanks,
> 
> Lukas
> 

How about setting information, that isn't available from PCI, in 
pci_device_id.driver_data when initializing nhi_ids[]?


Re: linux-next 20170519 - semaphores broken

2017-05-20 Thread Kees Cook
On Sat, May 20, 2017 at 1:18 PM,   wrote:
> Seeing problems with programs that use semaphores.  The one
> that I'm getting bit by is jackd.  strace says:
>
> getuid()= 967
> semget(0x282929, 0, 000)= 229376
> semop(229376, [{0, -1, SEM_UNDO}], 1)   = -1 EIDRM (Identifier removed)
> write(2, "JACK semaphore error: semop (Ide"..., 49JACK semaphore error: semop 
> (Identifier removed)
> ) = 49
>
> Bisects down to this commit, and reverting it from 20170519 makes things work
> again.  No idea why this causes indigestion, there's probably something subtly
> wrong here
>
> commit 337f43326737b5eb28eb13f43c27a5788da0f913
> Author: Manfred Spraul 
> Date:   Fri May 19 07:39:23 2017 +1000
>
> ipc: merge ipc_rcu and kern_ipc_perm
>
> ipc has two management structures that exist for every id:
> - struct kern_ipc_perm, it contains e.g. the permissions.
> - struct ipc_rcu, it contains the rcu head for rcu handling and
>   the refcount.

I think I found the cause of this. Prior to this change, the RCU head (with
refcount) was located ahead of the struct sem_array. After this change,
the RCU head and refcount are within it, so this is happening:

    sma = container_of(ipc_rcu_alloc(size), struct sem_array, sem_perm);
    if (!sma)
        return -ENOMEM;

    memset(sma, 0, size);

ipc_rcu_alloc() initializes the refcount to 1, and the memset() resets it
back to zero.

A work-around would be to wrap the memset() like this:

    struct kern_ipc_perm perm;
    ...
    perm = sma->sem_perm;
    memset(sma, 0, size);
    sma->sem_perm = perm;
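
The effect can be reproduced outside the kernel with simplified stand-ins. The sketch below is a userspace model only: the structure and function names are invented for illustration and the real kernel structures carry many more fields.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for kern_ipc_perm and sem_array. */
struct perm_model {
	int refcount;
	int key;
};

struct sem_array_model {
	struct perm_model sem_perm;	/* embedded, as after the merge */
	int sem_nsems;
};

/* Mimics ipc_rcu_alloc(): allocate and set the refcount to 1. */
static struct sem_array_model *model_alloc(void)
{
	struct sem_array_model *sma = malloc(sizeof(*sma));

	if (sma) {
		sma->sem_perm.refcount = 1;
		sma->sem_perm.key = 0;
		sma->sem_nsems = 0;
	}
	return sma;
}

/* Plain memset(): wipes the embedded refcount. Returns it, or -1. */
static int refcount_after_plain_memset(void)
{
	struct sem_array_model *sma = model_alloc();
	int ret;

	if (!sma)
		return -1;
	memset(sma, 0, sizeof(*sma));
	ret = sma->sem_perm.refcount;
	free(sma);
	return ret;
}

/* Work-around: save sem_perm, clear everything, then restore it. */
static int refcount_after_wrapped_memset(void)
{
	struct sem_array_model *sma = model_alloc();
	struct perm_model perm;
	int ret;

	if (!sma)
		return -1;
	perm = sma->sem_perm;
	memset(sma, 0, sizeof(*sma));
	sma->sem_perm = perm;
	ret = sma->sem_perm.refcount;
	free(sma);
	return ret;
}
```

In this model the first helper ends with a refcount of 0 and the second with 1, matching the description above.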

I actually have a series that changes things much more, and moves the
refcount set to ipc_addid() which is the only place it needs to happen
(though this requires fixing up the mistaken rcu freeing on error
paths). Here's the lightly tested series, on top of -next:

https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=kspp/gcc-plugin/randstruct-next-20170519-ipc

Manfred, I think I could get to the same results in fewer logical
steps, but I'm curious to see what you think of what I've got first.
Mainly I've done the following things:

- remove unneeded RCU free calls (since the IPC memory is only exposed
  to RCUness once ipc_addid() succeeds)
- move refcount init into ipc_addid() since it only needs to be
  initialized from that point on
- remove utility allocators since now nothing special needs to be done
  in the general case
- result is no requirement on kern_ipc_perm's position in ipc
  structures and cleaner code, IMO
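
The shape of that change, in a heavily simplified userspace model, might look like the following. The names are hypothetical and obj_publish() only imitates the role ipc_addid() plays in the series; it is not the actual code.

```c
#include <stdlib.h>

struct obj_model {
	int refcount;
	int id;
};

/* Plain zeroing allocation: no refcount set, so no special allocator. */
static struct obj_model *obj_alloc(void)
{
	return calloc(1, sizeof(struct obj_model));
}

/*
 * Imitates the proposed ipc_addid(): the object becomes visible to
 * others here, so this is the first point where the refcount matters.
 * On failure the object was never published and can be freed directly.
 */
static int obj_publish(struct obj_model *obj, int id)
{
	if (id < 0)
		return -1;
	obj->id = id;
	obj->refcount = 1;
	return 0;
}
```

Because the refcount is untouched until publication, zeroing the whole object after allocation is harmless in this model.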

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH 0/8] CaitSith LSM module

2017-05-20 Thread Tetsuo Handa
John Johansen wrote:
> On 11/22/2016 10:31 PM, Tetsuo Handa wrote:
> > Tetsuo Handa wrote:
> >> John Johansen wrote:
> >>>> In order to minimize the burden of reviewing, this patchset implements
> >>>> only functionality of checking program execution requests (i.e. execve()
> >>>> system call) using pathnames. I'm planning to add other functionalities
> >>>> after this version got included into mainline. You can find how future
> >>>> versions of CaitSith will look like at http://caitsith.osdn.jp/ .
> 
> >>> Thanks I've started working my way through this, but it is going to take
> >>> me a while.
> >>>
> >>
> >> Thank you for your time.
> > 
> > May I hear the status? Is there something I can do other than waiting?
> > 
> progressing very slowly, I have some time over the next few days as it's a
> long weekend here in the US so hopefully I can finish this up
> 

May I hear the status again?



On 5th March 2017, a CTF game was held in an event titled
"CyberColosseo x SecCon" ( http://2016.seccon.jp/news/#137 ). I gave a
simple troubleshooting-like system-analyzing quiz using SSH shell session
where operations are restricted by CaitSith.

Since the VM will be useful as an example of how to configure
CaitSith's policy configuration, I made a downloadable version.

  
http://osdn.jp/frs/redir.php?m=jaist=/caitsith/67303/SecCon20170305-CaitSith.zip
  MD5: 99bad6936d8cdeb37d0d6af99265a2ac

This VM is configured for VMware Player 12 / 4 CPUs / 2048MB RAM.
An IPv4 address will be assigned upon boot using DHCP service on the host 
network.
SSH username and password are both "caitsith".


Re: [PATCH 18/24] thunderbolt: Store Thunderbolt generation in the switch structure

2017-05-20 Thread Lukas Wunner
On Thu, May 18, 2017 at 05:39:08PM +0300, Mika Westerberg wrote:
> In some cases it is useful to know what is the Thunderbolt generation
> the switch supports. This introduces a new field to struct switch that
> stores the generation of the switch based on the device ID.
> 
> Signed-off-by: Mika Westerberg 
> Reviewed-by: Yehezkel Bernat 
> Reviewed-by: Michael Jamet 
> ---
>  drivers/thunderbolt/switch.c | 24 ++++++++++++++++++++++++
>  drivers/thunderbolt/tb.h     |  2 ++
>  2 files changed, 26 insertions(+)
> 
> diff --git a/drivers/thunderbolt/switch.c b/drivers/thunderbolt/switch.c
> index 396e00ab7723..9c91d397d3b3 100644
> --- a/drivers/thunderbolt/switch.c
> +++ b/drivers/thunderbolt/switch.c
> @@ -382,6 +382,28 @@ struct device_type tb_switch_type = {
>   .release = tb_switch_release,
>  };
>  
> +static void tb_switch_set_generation(struct tb_switch *sw)
> +{
> +	switch (sw->config.device_id) {
> +	case PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_LP_BRIDGE:
> +	case PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_2C_BRIDGE:
> +	case PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_BRIDGE:
> +	case PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_2C_BRIDGE:
> +	case PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_4C_BRIDGE:
> +		sw->generation = 3;
> +		break;
> +
> +	case PCI_DEVICE_ID_INTEL_FALCON_RIDGE_2C_BRIDGE:
> +	case PCI_DEVICE_ID_INTEL_FALCON_RIDGE_4C_BRIDGE:
> +		sw->generation = 2;
> +		break;

This is missing PCI_DEVICE_ID_INTEL_WIN_RIDGE_2C_BRIDGE (0x157e).


> +
> +	default:
> +		sw->generation = 1;
> +		break;

If someone adds an entry for, say, a new TB3 controller to nhi_ids[]
but forgets to update this function, the controller is assigned the
wrong generation number.  It might be better to make TB3 the default
and list each TB1 controller instead since it's less likely for Intel
to introduce an older gen chip.
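
Roughly what I have in mind, as a standalone sketch rather than a drop-in replacement. The macro and function names are made up, the TB1/TB2 example IDs are placeholders, and treating 0x157e as TB2 is an assumption based on where it would slot into the patch.

```c
#include <stdint.h>

/* Placeholder IDs for illustration, except 0x157e quoted above. */
#define ID_TB1_EXAMPLE_A	0x0001	/* hypothetical TB1 controller */
#define ID_TB1_EXAMPLE_B	0x0002	/* hypothetical TB1 controller */
#define ID_TB2_EXAMPLE		0x0010	/* hypothetical TB2 controller */
#define ID_WIN_RIDGE_2C_BRIDGE	0x157e	/* assumed to be TB2 */

/* Older generations are listed explicitly; unknown IDs count as TB3. */
static unsigned int tb_generation_from_id(uint16_t device_id)
{
	switch (device_id) {
	case ID_TB1_EXAMPLE_A:
	case ID_TB1_EXAMPLE_B:
		return 1;
	case ID_TB2_EXAMPLE:
	case ID_WIN_RIDGE_2C_BRIDGE:
		return 2;
	default:
		return 3;
	}
}
```

The lists of old controllers are closed sets, so a newly added ID falls through to the newest generation instead of the oldest.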

Generally I think it's problematic to require that multiple files
are touched whenever a new controller is added.  Isn't the generation
number or link speed (10/20/40) stored in some register in PCI config
space (VSEC 0x1234) or TB config space?

Thanks,

Lukas

> + }
> +}
> +
>  /**
>   * tb_switch_alloc() - allocate a switch
>   * @tb: Pointer to the owning domain
> @@ -442,6 +464,8 @@ struct tb_switch *tb_switch_alloc(struct tb *tb, struct 
> device *parent,
>   }
>   sw->cap_plug_events = cap;
>  
> + tb_switch_set_generation(sw);
> +
>   device_initialize(&sw->dev);
>   sw->dev.parent = parent;
>   sw->dev.bus = &tb_bus_type;
> diff --git a/drivers/thunderbolt/tb.h b/drivers/thunderbolt/tb.h
> index 0be989069941..b3cda7605619 100644
> --- a/drivers/thunderbolt/tb.h
> +++ b/drivers/thunderbolt/tb.h
> @@ -25,6 +25,7 @@
>   * @device: Device ID of the switch
>   * @vendor_name: Name of the vendor (or %NULL if not known)
>   * @device_name: Name of the device (or %NULL if not known)
> + * @generation: Switch Thunderbolt generation
>   * @cap_plug_events: Offset to the plug events capability (%0 if not found)
>   * @is_unplugged: The switch is going away
>   * @drom: DROM of the switch (%NULL if not found)
> @@ -40,6 +41,7 @@ struct tb_switch {
>   u16 device;
>   const char *vendor_name;
>   const char *device_name;
> + unsigned int generation;
>   int cap_plug_events;
>   bool is_unplugged;
>   u8 *drom;
> -- 
> 2.11.0
> 
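
The suggestion above, to treat the newest generation as the default and list only the older controllers, can be sketched roughly as follows. This is a standalone illustration, not the driver's code: the EXAMPLE_ID_* constants and example_generation() are hypothetical placeholders and do not correspond to real entries in pci_ids.h.

```c
#include <stdint.h>

/* Hypothetical device IDs, for illustration only. */
enum {
	EXAMPLE_ID_TB1_A = 0x0001,
	EXAMPLE_ID_TB1_B = 0x0002,
	EXAMPLE_ID_TB2_A = 0x0010,
};

/*
 * Map a device ID to a Thunderbolt generation. Only the older
 * controllers are listed; any ID that is not listed is assumed to
 * belong to the newest generation.
 */
static unsigned int example_generation(uint16_t device_id)
{
	switch (device_id) {
	case EXAMPLE_ID_TB1_A:
	case EXAMPLE_ID_TB1_B:
		return 1;

	case EXAMPLE_ID_TB2_A:
		return 2;

	default:
		/* A newly added controller lands here without further edits. */
		return 3;
	}
}
```

With this layout, forgetting to update the function for a new controller yields generation 3 rather than generation 1.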


[PATCH] drm/i915: mark wait_for_engine() __maybe_unused

2017-05-20 Thread Nick Desaulniers
This solves a warning when compiling the driver with Clang, -Werror enabled,
and CONFIG_DRM_I915_DEBUG_GEM unset, since Clang warns that:

drivers/gpu/drm/i915/i915_gem.c:3274:12: error: function 'wait_for_engine'
  is not needed and will not be emitted
  [-Werror,-Wunneeded-internal-declaration]
static int wait_for_engine(struct intel_engine_cs *engine, int timeout_ms)
           ^

Signed-off-by: Nick Desaulniers 
---
Additionally, it only has one call site. Should I mark it inline, too, while
I'm at it?

 drivers/gpu/drm/i915/i915_gem.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b6ac3df18b58..73b82fb94b0e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3271,7 +3271,8 @@ static int wait_for_timeline(struct i915_gem_timeline *tl, unsigned int flags)
 	return 0;
 }
 
-static int wait_for_engine(struct intel_engine_cs *engine, int timeout_ms)
+static __maybe_unused int wait_for_engine(
+	struct intel_engine_cs *engine, int timeout_ms)
 {
 	return wait_for(intel_engine_is_idle(engine), timeout_ms);
 }
-- 
2.11.0
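
A minimal userspace sketch of the situation the patch addresses, under some assumptions: a static function is named only inside an unevaluated context (here a sizeof-based check macro), so the compiler never needs to emit it. EXAMPLE_ASSERT, EXAMPLE_DEBUG, example_wait_for_idle() and example_check() are hypothetical names, and the __maybe_unused definition below stands in for the one the kernel provides in its compiler headers.

```c
#include <stdlib.h>

/* Userspace stand-in for the kernel's __maybe_unused annotation. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

#ifdef EXAMPLE_DEBUG
/* Debug build: the expression is really evaluated. */
#define EXAMPLE_ASSERT(expr) do { if (!(expr)) abort(); } while (0)
#else
/* Non-debug build: the expression is type-checked but never evaluated. */
#define EXAMPLE_ASSERT(expr) ((void)sizeof((long)(expr)))
#endif

/*
 * Without the annotation, a compiler may warn that this function is
 * referenced but not needed when EXAMPLE_DEBUG is unset.
 */
static __maybe_unused int example_wait_for_idle(int timeout_ms)
{
	return timeout_ms > 0 ? 0 : -1;
}

static int example_check(void)
{
	EXAMPLE_ASSERT(example_wait_for_idle(50) == 0);
	return 0;
}
```

The annotation only silences the diagnostic; it does not change whether the function is emitted.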



[PATCH v2 2/2] dt-bindings: phy: Add documentation for Mediatek PCIe PHY

2017-05-20 Thread Ryder Lee
Add documentation for PCIe PHY available in MT7623 series SoCs.

Signed-off-by: Ryder Lee 
Acked-by: Rob Herring 
---
 .../devicetree/bindings/phy/phy-mt7623-pcie.txt| 63 ++
 1 file changed, 63 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/phy/phy-mt7623-pcie.txt

diff --git a/Documentation/devicetree/bindings/phy/phy-mt7623-pcie.txt 
b/Documentation/devicetree/bindings/phy/phy-mt7623-pcie.txt
new file mode 100644
index 0000000..6fefac5
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/phy-mt7623-pcie.txt
@@ -0,0 +1,63 @@
+Mediatek MT7623 PCIe PHY
+------------------------
+
+Required properties:
+ - compatible: Should contain "mediatek,mt7623-pcie-phy".
+ - reg: Base address and length of the registers.
+ - clocks: Must contain an entry in clock-names.
+   See ../clocks/clock-bindings.txt for details.
+ - clock-names: Must be "pciephya_ref"
+ - #phy-cells: Must be 0.
+
+Optional properties:
+ - mediatek,phy-switch: A phandle to the system controller, used to
+   switch the PHY on PCIe port2 which is shared with USB u3phy2.
+
+Example:
+
+   pcie0_phy: pcie-phy@1a149000 {
+   compatible = "mediatek,mt7623-pcie-phy";
+   reg = <0 0x1a149000 0 0x1000>;
+   clocks = <>;
+   clock-names = "pciephya_ref";
+   #phy-cells = <0>;
+   };
+
+   pcie1_phy: pcie-phy@1a14a000 {
+   compatible = "mediatek,mt7623-pcie-phy";
+   reg = <0 0x1a14a000 0 0x1000>;
+   clocks = <>;
+   clock-names = "pciephya_ref";
+   #phy-cells = <0>;
+   };
+
+   pcie2_phy: pcie-phy@1a244000 {
+   compatible = "mediatek,mt7623-pcie-phy";
+   reg = <0 0x1a244000 0 0x1000>;
+   clocks = <>;
+   clock-names = "pciephya_ref";
+   #phy-cells = <0>;
+
+   mediatek,phy-switch = <>;
+   };
+
+Specifying phy control of devices
+---------------------------------
+
+Device nodes should specify the configuration required in their "phys"
+property, containing a phandle to the phy node and phy-names.
+
+Example:
+
+#include 
+
+pcie: pcie@1a140000 {
+   ...
+   pcie@0,0 {
+   ...
+   phys = <&pcie0_phy>;
+   phy-names = "pcie-phy0";
+   };
+   ...
+};
+
-- 
1.9.1



Re: [PATCH 1/2] libsas: Don't process sas events in static works

2017-05-20 Thread Dan Williams
On Fri, May 19, 2017 at 11:39 PM, Yijing Wang  wrote:
> Currently the libsas hotplug work is static, and the LLDD driver queues
> the hotplug work into shost->work_q. If the LLDD driver posts a burst
> of hotplug events to libsas, the events may be left pending in the
> workqueue like this:
>
> shost->work_q
> new work[PORTE_BYTES_DMAED] --> |[PHYE_LOSS_OF_SIGNAL][PORTE_BYTES_DMAED] -> processing
> |<---wait worker to process>|
> In this case, when a new PORTE_BYTES_DMAED event arrives, libsas tries to
> queue it to shost->work_q, but the work is already pending, so the event is
> lost. Finally, libsas deletes the related sas port and sas devices, but the
> LLDD driver expects libsas to add the sas port and devices (the last sas event).
>
> This patch removes the statically defined hotplug work and uses dynamic work
> to avoid missing hotplug events.

If we go this route we don't even need:

sas_port_event_fns
sas_phy_event_fns
sas_ha_event_fns

...just specify the target routine directly to INIT_WORK() and remove
the indirection.

I also think for safety this should use a mempool that guarantees that
events can continue to be processed under system memory pressure.
Also, have you considered the case when a broken phy starts throwing a
constant stream of events? Is there a point at which libsas should
stop queuing events and disable the phy?
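
To make the lost-event problem concrete, here is a small userspace model, assuming only the rule described in the quoted text: a work item that is already pending is not queued a second time. It is not libsas or workqueue code; every example_* name is hypothetical, and plain calloc() stands in for whatever allocator (such as the mempool suggested above) a real implementation would use.

```c
#include <stdbool.h>
#include <stdlib.h>

struct example_work {
	struct example_work *next;
	int event;
	bool pending;
};

struct example_queue {
	struct example_work *head, *tail;
	int length;
};

/* Queue a work item unless it is already pending. */
static bool example_queue_work(struct example_queue *q, struct example_work *w)
{
	if (w->pending)
		return false;	/* already queued: this request is dropped */
	w->pending = true;
	w->next = NULL;
	if (q->tail)
		q->tail->next = w;
	else
		q->head = w;
	q->tail = w;
	q->length++;
	return true;
}

/* Static scheme: one preallocated item per event type. */
static int example_post_static(struct example_queue *q, struct example_work *slot)
{
	return example_queue_work(q, slot) ? 0 : -1;
}

/* Dynamic scheme: every event gets its own freshly allocated item. */
static int example_post_dynamic(struct example_queue *q, int event)
{
	struct example_work *w = calloc(1, sizeof(*w));

	if (!w)
		return -1;	/* allocation failure also loses the event */
	w->event = event;
	example_queue_work(q, w);
	return 0;
}
```

Posting the same event twice through the static slot queues it once; the dynamic path queues it twice, at the cost of an allocation that can fail. That failure mode is what the mempool remark is about, and an unbounded event stream from a broken phy would grow the dynamic queue without limit.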


[PATCH v2 1/2] phy: add PCIe PHY driver for mt7623 SoCs families

2017-05-20 Thread Ryder Lee
Add support for the PCIe PHY found in the MT7623 SoC family.

Signed-off-by: Ryder Lee 
---
 drivers/phy/Kconfig   |   8 ++
 drivers/phy/Makefile  |   1 +
 drivers/phy/phy-mt7623-pcie.c | 290 ++
 3 files changed, 299 insertions(+)
 create mode 100644 drivers/phy/phy-mt7623-pcie.c

diff --git a/drivers/phy/Kconfig b/drivers/phy/Kconfig
index afaf7b6..94b3e61 100644
--- a/drivers/phy/Kconfig
+++ b/drivers/phy/Kconfig
@@ -530,4 +530,12 @@ config PHY_MESON8B_USB2
  and GXBB SoCs.
  If unsure, say N.
 
+config PHY_MT7623_PCIE
+   tristate "Mediatek PCIe PHY driver for MT7623 SoC families"
+   depends on ARCH_MEDIATEK && OF
+   select GENERIC_PHY
+   select MFD_SYSCON
+   help
+ Say 'Y' here to add support for Mediatek PCIe PHY driver which
+ can be found on the MT7623 SoC families.
 endmenu
diff --git a/drivers/phy/Makefile b/drivers/phy/Makefile
index f8047b4..0185f6a 100644
--- a/drivers/phy/Makefile
+++ b/drivers/phy/Makefile
@@ -64,3 +64,4 @@ obj-$(CONFIG_PHY_CYGNUS_PCIE) += phy-bcm-cygnus-pcie.o
 obj-$(CONFIG_ARCH_TEGRA) += tegra/
 obj-$(CONFIG_PHY_NS2_PCIE) += phy-bcm-ns2-pcie.o
 obj-$(CONFIG_PHY_MESON8B_USB2) += phy-meson8b-usb2.o
+obj-$(CONFIG_PHY_MT7623_PCIE)  += phy-mt7623-pcie.o
diff --git a/drivers/phy/phy-mt7623-pcie.c b/drivers/phy/phy-mt7623-pcie.c
new file mode 100644
index 0000000..9c32601
--- /dev/null
+++ b/drivers/phy/phy-mt7623-pcie.c
@@ -0,0 +1,290 @@
+/*
+ * Copyright (c) 2017 MediaTek Inc.
+ * Author: Ryder Lee 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* Offsets of sub-segment in each port registers */
+#define PCIE_SIFSLV_PHYD_BANK2_BASE         0xa00
+#define SSUSB_SIFSLV_PHYA_BASE              0xb00
+#define SSUSB_SIFSLV_PHYA_DA_BASE           0xc00
+
+/*
+ * RX detection stable - 1 scale represent 8 reference cycles
+ * cover reference clock from 1M~100MHz, 7us~40us
+ */
+#define B2_PHYD_RXDET1                      (PCIE_SIFSLV_PHYD_BANK2_BASE + 0x28)
+#define RG_SSUSB_RXDET_STB2                 GENMASK(17, 9)
+#define RG_SSUSB_RXDET_STB2_VAL(x)          ((0x1ff & (x)) << 9)
+
+#define B2_PHYD_RXDET2                      (PCIE_SIFSLV_PHYD_BANK2_BASE + 0x2c)
+#define RG_SSUSB_RXDET_STB2_P3              GENMASK(8, 0)
+#define RG_SSUSB_RXDET_STB2_P3_VAL(x)       (0x1ff & (x))
+
+#define U3_PHYA_REG0                        (SSUSB_SIFSLV_PHYA_BASE + 0x00)
+#define RG_PCIE_CLKDRV_OFFSET               GENMASK(3, 1)
+#define RG_PCIE_CLKDRV_OFFSET_VAL(x)        ((0x3 & (x)) << 2)
+
+#define U3_PHYA_REG1                        (SSUSB_SIFSLV_PHYA_BASE + 0x04)
+#define RG_PCIE_CLKDRV_AMP                  GENMASK(31, 29)
+#define RG_PCIE_CLKDRV_AMP_VAL(x)           ((0x7 & (x)) << 29)
+
+#define DA_SSUSB_CDR_REFCK_SEL              (SSUSB_SIFSLV_PHYA_DA_BASE + 0x00)
+#define RG_SSUSB_XTAL_EXT_PE1H              GENMASK(13, 12)
+#define RG_SSUSB_XTAL_EXT_PE1H_VAL(x)       ((0x3 & (x)) << 12)
+#define RG_SSUSB_XTAL_EXT_PE2H              GENMASK(17, 16)
+#define RG_SSUSB_XTAL_EXT_PE2H_VAL(x)       ((0x3 & (x)) << 16)
+
+#define DA_SSUSB_PLL_IC                     (SSUSB_SIFSLV_PHYA_DA_BASE + 0x0c)
+#define RG_SSUSB_PLL_IC_PE2H                GENMASK(15, 12)
+#define RG_SSUSB_PLL_IC_PE2H_VAL(x)         ((0xf & (x)) << 12)
+#define RG_SSUSB_PLL_BR_PE2H                GENMASK(29, 28)
+#define RG_SSUSB_PLL_BR_PE2H_VAL(x)         ((0x3 & (x)) << 28)
+
+#define DA_SSUSB_PLL_BC                     (SSUSB_SIFSLV_PHYA_DA_BASE + 0x08)
+#define RG_SSUSB_PLL_DIVEN_PE2H             GENMASK(21, 19)
+#define RG_SSUSB_PLL_BC_PE2H                GENMASK(7, 6)
+#define RG_SSUSB_PLL_BC_PE2H_VAL(x)         ((0x3 & (x)) << 6)
+
+#define DA_SSUSB_PLL_IR                     (SSUSB_SIFSLV_PHYA_DA_BASE + 0x10)
+#define RG_SSUSB_PLL_IR_PE2H                GENMASK(19, 16)
+#define RG_SSUSB_PLL_IR_PE2H_VAL(x)         ((0xf & (x)) << 16)
+
+#define DA_SSUSB_PLL_BP                     (SSUSB_SIFSLV_PHYA_DA_BASE + 0x14)
+#define RG_SSUSB_PLL_BP_PE2H                GENMASK(19, 16)
+#define RG_SSUSB_PLL_BP_PE2H_VAL(x)         ((0xf & (x)) << 16)
+
+#define DA_SSUSB_PLL_SSC_DELTA1_REG20       (SSUSB_SIFSLV_PHYA_DA_BASE + 0x3c)
+#define RG_SSUSB_PLL_SSC_DELTA1_PE2H        GENMASK(31, 16)
+#define RG_SSUSB_PLL_SSC_DELTA1_PE2H_VAL(x) ((0xffff & (x)) << 16)
+
+#define DA_SSUSB_PLL_SSC_DELTA_REG25        (SSUSB_SIFSLV_PHYA_DA_BASE + 0x48)
+#define RG_SSUSB_PLL_SSC_DELTA_PE2H         GENMASK(15, 0)
+#define RG_SSUSB_PLL_SSC_DELTA_PE2H_VAL(x)  (0xffff & (x))
+
+#define HIF_SYSCFG1                         0x14
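
The rest of the driver is cut off here, so how these definitions are consumed is not shown. Mask/value pairs of this shape are typically used in a read-modify-write sequence, and the sketch below illustrates that pattern in plain userspace C. EXAMPLE_GENMASK is a stand-in written for this sketch (32-bit only) rather than the kernel's GENMASK, and example_update_field() is a hypothetical helper, not a function from the patch.

```c
#include <stdint.h>

/* Userspace stand-in for GENMASK(h, l), valid for 0 <= l <= h <= 31. */
#define EXAMPLE_GENMASK(h, l) \
	(((~UINT32_C(0)) << (l)) & ((~UINT32_C(0)) >> (31 - (h))))

/* Same shape as RG_SSUSB_PLL_IC_PE2H and its _VAL() companion. */
#define EXAMPLE_PLL_IC_PE2H		EXAMPLE_GENMASK(15, 12)
#define EXAMPLE_PLL_IC_PE2H_VAL(x)	((UINT32_C(0xf) & (x)) << 12)

/* Clear the field selected by mask, then write the new field value. */
static uint32_t example_update_field(uint32_t reg, uint32_t mask, uint32_t val)
{
	reg &= ~mask;
	reg |= val & mask;
	return reg;
}
```

For example, example_update_field(old, EXAMPLE_PLL_IC_PE2H, EXAMPLE_PLL_IC_PE2H_VAL(2)) rewrites only bits 15:12 and leaves the remaining bits of old untouched.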

[PATCH v2 0/2] Add PCIe phy driver for some Mediatek SoCs

2017-05-20 Thread Ryder Lee
Hi,

This patch series adds a PCIe PHY driver and the related dt-binding document
for the Mediatek MT7623 SoC family.

Changes since v2:
- rebase to Linux 4.12-rc1

Changes since v1:
- revise binding document:
  drop 'status' properties.
  add a description to 'phy-switch' property and add vendor prefix.

Ryder Lee (2):
  phy: add PCIe PHY driver for mt7623 SoCs families
  dt-bindings: phy: Add documentation for Mediatek PCIe PHY

 .../devicetree/bindings/phy/phy-mt7623-pcie.txt|  63 +
 drivers/phy/Kconfig|   8 +
 drivers/phy/Makefile   |   1 +
 drivers/phy/phy-mt7623-pcie.c  | 290 +
 4 files changed, 362 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/phy/phy-mt7623-pcie.txt
 create mode 100644 drivers/phy/phy-mt7623-pcie.c

-- 
1.9.1



[PATCH v5 2/2] dt-bindings: pcie: Add documentation for Mediatek PCIe

2017-05-20 Thread Ryder Lee
Add documentation for PCIe host driver available in MT7623
series SoCs.

Signed-off-by: Ryder Lee 
Acked-by: Rob Herring 
---
 .../bindings/pci/mediatek,mt7623-pcie.txt  | 130 +
 1 file changed, 130 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/pci/mediatek,mt7623-pcie.txt

diff --git a/Documentation/devicetree/bindings/pci/mediatek,mt7623-pcie.txt 
b/Documentation/devicetree/bindings/pci/mediatek,mt7623-pcie.txt
new file mode 100644
index 0000000..ae4a3f4
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/mediatek,mt7623-pcie.txt
@@ -0,0 +1,130 @@
+Mediatek Gen2 PCIe controller which is available on MT7623 series SoCs
+
+PCIe subsys supports a single root complex (RC) with 3 Root Ports. Each Root
+Port supports a Gen2 1-lane link and has a PIPE interface to the PHY.
+
+Required properties:
+- compatible: Should contain "mediatek,mt7623-pcie".
+- device_type: Must be "pci"
+- reg: Base addresses and lengths of the PCIe controller.
+- #address-cells: Address representation for root ports (must be 3)
+- #size-cells: Size representation for root ports (must be 2)
+- #interrupt-cells: Size representation for interrupts (must be 1)
+- interrupt-map-mask and interrupt-map: Standard PCI IRQ mapping properties
+  Please refer to the standard PCI bus binding document for a more detailed
+  explanation.
+- clocks: Must contain an entry for each entry in clock-names.
+  See ../clocks/clock-bindings.txt for details.
+- clock-names: Must include the following entries:
+  - free_ck :for reference clock of PCIe subsys
+  - sys_ck0 :for clock of Port0
+  - sys_ck1 :for clock of Port1
+  - sys_ck2 :for clock of Port2
+- resets: Must contain an entry for each entry in reset-names.
+  See ../reset/reset.txt for details.
+- reset-names: Must include the following entries:
+  - pcie-rst0 :port0 reset
+  - pcie-rst1 :port1 reset
+  - pcie-rst2 :port2 reset
+- phys: List of PHY specifiers (used by generic PHY framework).
+- phy-names : Must be "pcie-phy0", "pcie-phy1", "pcie-phyN".. based on the
+  number of PHYs as specified in *phys* property.
+- power-domains: A phandle and power domain specifier pair to the power domain
+  which is responsible for collapsing and restoring power to the peripheral.
+- bus-range: Range of bus numbers associated with this controller.
+- ranges: Ranges for the PCI memory and I/O regions.
+
+In addition, the device tree node must have sub-nodes describing each
+PCIe port interface, having the following mandatory properties:
+
+Required properties:
+- device_type: Must be "pci"
+- reg: Only the first four bytes are used to refer to the correct bus number
+  and device number.
+- #address-cells: Must be 3
+- #size-cells: Must be 2
+- #interrupt-cells: Must be 1
+- interrupt-map-mask and interrupt-map: Standard PCI IRQ mapping properties
+  Please refer to the standard PCI bus binding document for a more detailed
+  explanation.
+- ranges: Sub-ranges distributed from the PCIe controller node. An empty
+  property is sufficient.
+- num-lanes: Number of lanes to use for this port.
+
+Examples:
+
+   hifsys: syscon@1a000000 {
+   compatible = "mediatek,mt7623-hifsys",
+"mediatek,mt2701-hifsys",
+"syscon";
+   reg = <0 0x1a000000 0 0x1000>;
+   #clock-cells = <1>;
+   #reset-cells = <1>;
+   };
+
+   pcie: pcie-controller@1a140000 {
+   compatible = "mediatek,mt7623-pcie";
+   device_type = "pci";
+   reg = <0 0x1a140000 0 0x1000>, /* PCIe shared registers */
+ <0 0x1a142000 0 0x1000>, /* Port0 registers */
+ <0 0x1a143000 0 0x1000>, /* Port1 registers */
+ <0 0x1a144000 0 0x1000>; /* Port2 registers */
+   #address-cells = <3>;
+   #size-cells = <2>;
+   #interrupt-cells = <1>;
+   interrupt-map-mask = <0xf800 0 0 0>;
+   interrupt-map = <0x0000 0 0 0  GIC_SPI 193 IRQ_TYPE_LEVEL_LOW>,
+   <0x0800 0 0 0  GIC_SPI 194 IRQ_TYPE_LEVEL_LOW>,
+   <0x1000 0 0 0  GIC_SPI 195 IRQ_TYPE_LEVEL_LOW>;
+   clocks = < CLK_TOP_ETHIF_SEL>,
+< CLK_HIFSYS_PCIE0>,
+< CLK_HIFSYS_PCIE1>,
+< CLK_HIFSYS_PCIE2>;
+   clock-names = "free_ck", "sys_ck0", "sys_ck1", "sys_ck2";
+   resets = < MT2701_HIFSYS_PCIE0_RST>,
+< MT2701_HIFSYS_PCIE1_RST>,
+< MT2701_HIFSYS_PCIE2_RST>;
+   reset-names = "pcie-rst0", "pcie-rst1", "pcie-rst2";
+   phys = <&pcie0_phy>, <&pcie1_phy>, <&pcie2_phy>;
+   phy-names = "pcie-phy0", "pcie-phy1", "pcie-phy2";
+   power-domains = < MT2701_POWER_DOMAIN_HIF>;
+   bus-range = <0x00 0xff>;
+
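
The interrupt-map-mask value 0xf800 in the example keeps only bits 15:11 of the first address cell, which in the standard PCI bus binding hold the device number. The short sketch below decodes that cell in userspace C to show why the three interrupt-map entries correspond to devices 0, 1 and 2. The struct and function names are hypothetical and only the bus, device and function fields are decoded.

```c
#include <stdint.h>

struct example_pci_addr {
	unsigned int bus;
	unsigned int device;
	unsigned int function;
};

/* Decode the bus/device/function fields of a PCI phys.hi address cell. */
static struct example_pci_addr example_decode_phys_hi(uint32_t phys_hi)
{
	struct example_pci_addr a = {
		.bus = (phys_hi >> 16) & 0xff,		/* bits 23:16 */
		.device = (phys_hi >> 11) & 0x1f,	/* bits 15:11 */
		.function = (phys_hi >> 8) & 0x7,	/* bits 10:8 */
	};

	return a;
}
```

Masking with 0xf800 therefore matches an interrupt-map entry on the device number alone, ignoring bus and function.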

[PATCH v5 2/2] dt-bindings: pcie: Add documentation for Mediatek PCIe

2017-05-20 Thread Ryder Lee
Add documentation for PCIe host driver available in MT7623
series SoCs.

Signed-off-by: Ryder Lee 
Acked-by: Rob Herring 
---
 .../bindings/pci/mediatek,mt7623-pcie.txt  | 130 +
 1 file changed, 130 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/pci/mediatek,mt7623-pcie.txt

diff --git a/Documentation/devicetree/bindings/pci/mediatek,mt7623-pcie.txt 
b/Documentation/devicetree/bindings/pci/mediatek,mt7623-pcie.txt
new file mode 100644
index 000..ae4a3f4
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/mediatek,mt7623-pcie.txt
@@ -0,0 +1,130 @@
+Mediatek Gen2 PCIe controller which is available on MT7623 series SoCs
+
+PCIe subsys supports single root complex (RC) with 3 Root Ports. Each root
+ports supports a Gen2 1-lane Link and has PIPE interface to PHY.
+
+Required properties:
+- compatible: Should contain "mediatek,mt7623-pcie".
+- device_type: Must be "pci"
+- reg: Base addresses and lengths of the PCIe controller.
+- #address-cells: Address representation for root ports (must be 3)
+- #size-cells: Size representation for root ports (must be 2)
+- #interrupt-cells: Size representation for interrupts (must be 1)
+- interrupt-map-mask and interrupt-map: Standard PCI IRQ mapping properties
+  Please refer to the standard PCI bus binding document for a more detailed
+  explanation.
+- clocks: Must contain an entry for each entry in clock-names.
+  See ../clocks/clock-bindings.txt for details.
+- clock-names: Must include the following entries:
+  - free_ck :for reference clock of PCIe subsys
+  - sys_ck0 :for clock of Port0
+  - sys_ck1 :for clock of Port1
+  - sys_ck2 :for clock of Port2
+- resets: Must contain an entry for each entry in reset-names.
+  See ../reset/reset.txt for details.
+- reset-names: Must include the following entries:
+  - pcie-rst0 :port0 reset
+  - pcie-rst1 :port1 reset
+  - pcie-rst2 :port2 reset
+- phys: List of PHY specifiers (used by generic PHY framework).
+- phy-names : Must be "pcie-phy0", "pcie-phy1", "pcie-phyN".. based on the
+  number of PHYs as specified in *phys* property.
+- power-domains: A phandle and power domain specifier pair to the power domain
+  which is responsible for collapsing and restoring power to the peripheral.
+- bus-range: Range of bus numbers associated with this controller.
+- ranges: Ranges for the PCI memory and I/O regions.
+
+In addition, the device tree node must have sub-nodes describing each
+PCIe port interface, having the following mandatory properties:
+
+Required properties:
+- device_type: Must be "pci"
+- reg: Only the first four bytes are used to refer to the correct bus number
+  and device number.
+- #address-cells: Must be 3
+- #size-cells: Must be 2
+- #interrupt-cells: Must be 1
+- interrupt-map-mask and interrupt-map: Standard PCI IRQ mapping properties
+  Please refer to the standard PCI bus binding document for a more detailed
+  explanation.
+- ranges: Sub-ranges distributed from the PCIe controller node. An empty
+  property is sufficient.
+- num-lanes: Number of lanes to use for this port.
+
+Examples:
+
+   hifsys: syscon@1a00 {
+   compatible = "mediatek,mt7623-hifsys",
+"mediatek,mt2701-hifsys",
+"syscon";
+   reg = <0 0x1a00 0 0x1000>;
+   #clock-cells = <1>;
+   #reset-cells = <1>;
+   };
+
+   pcie: pcie-controller@1a14 {
+   compatible = "mediatek,mt7623-pcie";
+   device_type = "pci";
+   reg = <0 0x1a14 0 0x1000>, /* PCIe shared registers */
+ <0 0x1a142000 0 0x1000>, /* Port0 registers */
+ <0 0x1a143000 0 0x1000>, /* Port1 registers */
+ <0 0x1a144000 0 0x1000>; /* Port2 registers */
+   #address-cells = <3>;
+   #size-cells = <2>;
+   #interrupt-cells = <1>;
+   interrupt-map-mask = <0xf800 0 0 0>;
+   interrupt-map = <0x 0 0 0  GIC_SPI 193 IRQ_TYPE_LEVEL_LOW>,
+   <0x0800 0 0 0  GIC_SPI 194 IRQ_TYPE_LEVEL_LOW>,
+   <0x1000 0 0 0  GIC_SPI 195 IRQ_TYPE_LEVEL_LOW>;
+   clocks = < CLK_TOP_ETHIF_SEL>,
+< CLK_HIFSYS_PCIE0>,
+< CLK_HIFSYS_PCIE1>,
+< CLK_HIFSYS_PCIE2>;
+   clock-names = "free_ck", "sys_ck0", "sys_ck1", "sys_ck2";
+   resets = < MT2701_HIFSYS_PCIE0_RST>,
+< MT2701_HIFSYS_PCIE1_RST>,
+< MT2701_HIFSYS_PCIE2_RST>;
+   reset-names = "pcie-rst0", "pcie-rst1", "pcie-rst2";
+   phys = <_phy>, <_phy>, <_phy>;
+   phy-names = "pcie-phy0", "pcie-phy1", "pcie-phy2";
+   power-domains = < MT2701_POWER_DOMAIN_HIF>;
+   bus-range = <0x00 0xff>;
+   ranges = <0x8100 0 0x1a16 0 

[PATCH v5 0/2] Add PCIe host driver support for Mediatek SoCs

2017-05-20 Thread Ryder Lee
Hi,

This patch series adds the Mediatek Gen2 PCIe host controller driver and its
dt-binding document. The controller can be found on MT7623 series SoCs.

The driver was validated using Broadcom Tigon3 and Intel(R) 82575/82576
gigabit Ethernet cards.


Changes since v5:
- rebase to Linux 4.12-rc1.
- remove redundant module.h header and MODULE macros.

Changes since v4:
- move the per-port registers to the parent node.
- use a valid compatible for hifsys controller.
- use 'sysirq' instead of 'gic' as the correct 'interrupt-parent' of the
  interrupt-map properties.

  'sysirq' is an interrupt controller that can invert the polarity of GIC SPIs,
  so the irq type can properly be set to level low without any extra properties.
  Selecting the wrong interrupt-parent in the previous versions was a mistake.
  Now the unnecessary interrupt properties can be removed entirely from the binding.

Changes since v3:
- correct sub-nodes unit addresses.

Changes since v2:
- modify Kconfig to avoid kbuild test error on some architecture.
- change compatible string.
- revise binding document:
  add missing interrupt-names.
  remove the board dts example and drop 'status' properties.
  remove unnecessary descriptions about standard PCI bus binding.

Changes since v1:
- add .suppress_bind_attrs.
- remove unnecessary *_valid_device() pattern.
- remove PCI_PROBE_ONLY.
- use the regular readl() instead of readl_relaxed().
- add .map_bus() and change to use pci_generic_config_read/pci_generic_config_write.
- revise dt-binding document and move nonstandard properties to root node.
- change compatible string.
- use interrupt-map property and replace mtk_pcie_map_irq() with of_irq_parse_and_map_pci().
- use the new pci_register_host_bridge() method instead of pci_scan_root_bus()*

Ryder Lee (2):
  PCI: mediatek: Add Mediatek PCIe host controller support
  dt-bindings: pcie: Add documentation for Mediatek PCIe

 .../bindings/pci/mediatek,mt7623-pcie.txt  | 130 +
 drivers/pci/host/Kconfig   |  11 +
 drivers/pci/host/Makefile  |   1 +
 drivers/pci/host/pcie-mediatek.c   | 553 +
 4 files changed, 695 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/mediatek,mt7623-pcie.txt
 create mode 100644 drivers/pci/host/pcie-mediatek.c

-- 
1.9.1



[PATCH v5 1/2] PCI: mediatek: Add Mediatek PCIe host controller support

2017-05-20 Thread Ryder Lee
Add support for the Mediatek PCIe Gen2 controller which can
be found on MT7623 series SoCs.

Signed-off-by: Ryder Lee 
---
 drivers/pci/host/Kconfig |  11 +
 drivers/pci/host/Makefile|   1 +
 drivers/pci/host/pcie-mediatek.c | 553 +++
 3 files changed, 565 insertions(+)
 create mode 100644 drivers/pci/host/pcie-mediatek.c

diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
index 7f47cd5..d7d7c47 100644
--- a/drivers/pci/host/Kconfig
+++ b/drivers/pci/host/Kconfig
@@ -180,6 +180,17 @@ config PCIE_ROCKCHIP
  There is 1 internal PCIe port available to support GEN2 with
  4 slots.
 
+config PCIE_MEDIATEK
+   bool "Mediatek PCIe controller"
+   depends on ARM && (ARCH_MEDIATEK || COMPILE_TEST)
+   depends on OF
+   depends on PCI
+   select PCIEPORTBUS
+   help
+ Say Y here if you want to enable PCIe controller support on MT7623 series
+ SoCs. There is one single root complex with 3 root ports available.
+ Each port supports Gen2 lane x1.
+
 config VMD
depends on PCI_MSI && X86_64 && SRCU
tristate "Intel Volume Management Device Driver"
diff --git a/drivers/pci/host/Makefile b/drivers/pci/host/Makefile
index cab8795..b10d104 100644
--- a/drivers/pci/host/Makefile
+++ b/drivers/pci/host/Makefile
@@ -18,6 +18,7 @@ obj-$(CONFIG_PCIE_IPROC_BCMA) += pcie-iproc-bcma.o
 obj-$(CONFIG_PCIE_ALTERA) += pcie-altera.o
 obj-$(CONFIG_PCIE_ALTERA_MSI) += pcie-altera-msi.o
 obj-$(CONFIG_PCIE_ROCKCHIP) += pcie-rockchip.o
+obj-$(CONFIG_PCIE_MEDIATEK) += pcie-mediatek.o
 obj-$(CONFIG_VMD) += vmd.o
 
 # The following drivers are for devices that use the generic ACPI
diff --git a/drivers/pci/host/pcie-mediatek.c b/drivers/pci/host/pcie-mediatek.c
new file mode 100644
index 000..0cc7943
--- /dev/null
+++ b/drivers/pci/host/pcie-mediatek.c
@@ -0,0 +1,553 @@
+/*
+ * Mediatek PCIe host controller driver.
+ *
+ * Copyright (c) 2017 MediaTek Inc.
+ * Author: Ryder Lee 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* PCIe shared registers */
+#define PCIE_SYS_CFG            0x00
+#define PCIE_INT_ENABLE         0x0c
+#define PCIE_CFG_ADDR           0x20
+#define PCIE_CFG_DATA           0x24
+
+/* PCIe per port registers */
+#define PCIE_BAR0_SETUP         0x10
+#define PCIE_BAR1_SETUP         0x14
+#define PCIE_BAR0_MEM_BASE      0x18
+#define PCIE_CLASS              0x34
+#define PCIE_LINK_STATUS        0x50
+
+#define PCIE_PORT_INT_EN(x)     BIT(20 + (x))
+#define PCIE_PORT_PERST(x)      BIT(1 + (x))
+#define PCIE_PORT_LINKUP        BIT(0)
+#define PCIE_BAR_MAP_MAX        GENMASK(31, 16)
+
+#define PCIE_BAR_ENABLE         BIT(0)
+#define PCIE_REVISION_ID        BIT(0)
+#define PCIE_CLASS_CODE         (0x60400 << 8)
+#define PCIE_CONF_REG(regn)     (((regn) & GENMASK(7, 2)) | \
+                                ((((regn) >> 8) & GENMASK(3, 0)) << 24))
+#define PCIE_CONF_FUN(fun)      (((fun) << 8) & GENMASK(10, 8))
+#define PCIE_CONF_DEV(dev)      (((dev) << 11) & GENMASK(15, 11))
+#define PCIE_CONF_BUS(bus)      (((bus) << 16) & GENMASK(23, 16))
+#define PCIE_CONF_ADDR(regn, fun, dev, bus) \
+       (PCIE_CONF_REG(regn) | PCIE_CONF_FUN(fun) | \
+        PCIE_CONF_DEV(dev) | PCIE_CONF_BUS(bus))
+
+/* Mediatek specific configuration registers */
+#define PCIE_FTS_NUM            0x70c
+#define PCIE_FTS_NUM_MASK       GENMASK(15, 8)
+#define PCIE_FTS_NUM_L0(x)      ((x) & 0xff << 8)
+
+#define PCIE_FC_CREDIT          0x73c
+#define PCIE_FC_CREDIT_MASK     (GENMASK(31, 31) | GENMASK(28, 16))
+#define PCIE_FC_CREDIT_VAL(x)   ((x) << 16)
+
+/**
+ * struct mtk_pcie_port - PCIe port information
+ * @base: IO mapped register base
+ * @list: port list
+ * @pcie: pointer to PCIe host info
+ * @reset: pointer to port reset control
+ * @sys_ck: pointer to bus clock
+ * @phy: pointer to phy control block
+ * @lane: lane count
+ * @index: port index
+ */
+struct mtk_pcie_port {
+   void __iomem *base;
+   struct list_head list;
+   struct mtk_pcie *pcie;
+   struct reset_control *reset;
+   struct clk *sys_ck;
+   struct phy *phy;
+   u32 lane;
+   u32 index;
+};
+
+/**
+ * struct mtk_pcie - PCIe host information
+ * @dev: pointer to PCIe device
+ * @base: IO mapped register base
+ * @free_ck: free-run reference clock
+ * @io: IO resource
+ * @pio: PIO resource
+ * @mem: non-prefetchable memory resource
+ * @busn: bus range
+ * @offset: 
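
For readers following the PCIE_CONF_* macros in this patch, here is a minimal
user-space sketch of how they appear to pack a register number, function,
device and bus into one configuration address word. GENMASK32, the CONF_*
names and conf_addr() are stand-ins written for this illustration, not kernel
APIs. The bit layout is copied from the macros in the patch; how the driver
consumes the word (presumably by writing it to PCIE_CFG_ADDR) is an assumption.

```c
#include <stdint.h>

/* User-space stand-in for the kernel's GENMASK(h, l): bits l..h set (32-bit). */
#define GENMASK32(h, l) \
	(((~UINT32_C(0)) >> (31 - (h))) & ((~UINT32_C(0)) << (l)))

/* Same bit layout as PCIE_CONF_REG/FUN/DEV/BUS in the patch. */
#define CONF_REG(regn)	(((regn) & GENMASK32(7, 2)) | \
			 ((((regn) >> 8) & GENMASK32(3, 0)) << 24))
#define CONF_FUN(fun)	(((fun) << 8) & GENMASK32(10, 8))
#define CONF_DEV(dev)	(((dev) << 11) & GENMASK32(15, 11))
#define CONF_BUS(bus)	(((bus) << 16) & GENMASK32(23, 16))

/* Hypothetical helper: combine the four fields into one address word. */
static uint32_t conf_addr(uint32_t regn, uint32_t fun, uint32_t dev,
			  uint32_t bus)
{
	return CONF_REG(regn) | CONF_FUN(fun) | CONF_DEV(dev) | CONF_BUS(bus);
}
```

Under this sketch, conf_addr(0x104, 2, 3, 5) evaluates to 0x01051a04: register
bits [7:2] stay in place, register bits [11:8] move to bits [27:24], and the
function, device and bus numbers land at bits 8, 11 and 16 respectively.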

[PATCH] initramfs: Fix disabling of initramfs (and its compression)

2017-05-20 Thread Florian Fainelli
Commit db2aa7fd15e8 ("initramfs: allow again choice of the embedded
initram compression algorithm") introduced the possibility to select the
initramfs compression algorithm from Kconfig and while this is a nice
feature it broke the use case described below.

Here is what my build system does:

- kernel is initially configured not to have an initramfs included
- build the user space root file system
- re-configure the kernel to have an initramfs included
(CONFIG_INITRAMFS_SOURCE="/path/to/romfs") and set relevant
CONFIG_INITRAMFS options, in my case, no compression option
(CONFIG_INITRAMFS_COMPRESSION_NONE)
- kernel is re-built with these options -> kernel+initramfs image is
  copied
- kernel is re-built again without these options -> kernel image is
  copied

Building a kernel without an initramfs means setting this option:

CONFIG_INITRAMFS_SOURCE="" (and this one only)

whereas building a kernel with an initramfs means setting these options:

CONFIG_INITRAMFS_SOURCE="/home/fainelli/work/uclinux-rootfs/romfs
/home/fainelli/work/uclinux-rootfs/misc/initramfs.dev"
CONFIG_INITRAMFS_ROOT_UID=1000
CONFIG_INITRAMFS_ROOT_GID=1000
CONFIG_INITRAMFS_COMPRESSION_NONE=y
CONFIG_INITRAMFS_COMPRESSION=""

Commit db2aa7fd15e857891cefbada8348c8d938c7a2bc ("initramfs: allow again
choice of the embedded initram compression algorithm") is problematic
because CONFIG_INITRAMFS_COMPRESSION, which is used to determine the
initramfs_data.cpio extension/compression, is a string, and Kconfig
assigns it by evaluating its defaults in order.

Setting CONFIG_INITRAMFS_COMPRESSION_NONE with
CONFIG_INITRAMFS_SOURCE="" cannot possibly work (because of the depends
on INITRAMFS_SOURCE!="" imposed on CONFIG_INITRAMFS_COMPRESSION) yet we
still get CONFIG_INITRAMFS_COMPRESSION assigned to ".gz" because
CONFIG_RD_GZIP=y is set in my kernel, even when there is no initramfs
being built.

So we basically end up generating two initramfs_data.cpio* files, one
without extension, and one with .gz. This causes usr/Makefile to track
usr/initramfs_data.cpio.gz, and not usr/initramfs_data.cpio anymore.
That is also largely problematic after
9e3596b0c6539e28546ff7c72a06576627068353 ("kbuild: initramfs cleanup,
set target from Kconfig") because we used to track all possible
initramfs_data files in the $(targets) variable before that commit.

The end result is that the kernel with an initramfs clearly does not
contain what we expect it to, it has a stale initramfs_data.cpio file
built into it, and we keep re-generating an initramfs_data.cpio.gz file
which is not the one that we want to include in the kernel image proper.

The fix consists in hiding CONFIG_INITRAMFS_COMPRESSION when
CONFIG_INITRAMFS_SOURCE="". This restores the pre-4.10 behavior, where
we can properly disable and re-enable initramfs within the same kernel
.config file, and be in control of what CONFIG_INITRAMFS_COMPRESSION is
set to.

Fixes: db2aa7fd15e8 ("initramfs: allow again choice of the embedded initram compression algorithm")
Fixes: 9e3596b0c653 ("kbuild: initramfs cleanup, set target from Kconfig")
Signed-off-by: Florian Fainelli 
---
 usr/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/usr/Kconfig b/usr/Kconfig
index c0c48507e44e..ad0543e21760 100644
--- a/usr/Kconfig
+++ b/usr/Kconfig
@@ -220,6 +220,7 @@ config INITRAMFS_COMPRESSION_LZ4
 endchoice
 
 config INITRAMFS_COMPRESSION
+   depends on INITRAMFS_SOURCE!=""
string
default ""  if INITRAMFS_COMPRESSION_NONE
default ".gz"   if INITRAMFS_COMPRESSION_GZIP
-- 
2.9.3
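
As an illustration of the behaviour described in the commit message, the
following toy C model mimics how a Kconfig string symbol picks the first
"default ... if <cond>" whose condition holds. It is only a sketch: the enum,
the function name and the final empty-string fallback are assumptions made for
this example, and only the NONE and GZIP entries of the choice are modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state of the compression choice block. */
enum toy_choice { CHOICE_HIDDEN, CHOICE_NONE, CHOICE_GZIP };

/*
 * Toy model of how INITRAMFS_COMPRESSION gets its value.  Returns NULL
 * when the symbol is hidden by its dependency and so gets no value.
 */
static const char *toy_initramfs_compression(bool source_set,
					     enum toy_choice choice,
					     bool rd_gzip, bool patched)
{
	/* The choice is not visible without an initramfs source. */
	if (!source_set)
		choice = CHOICE_HIDDEN;

	/* The one-line fix: the string symbol is hidden in that case too. */
	if (patched && !source_set)
		return NULL;

	/* Defaults are tried in order; the first match wins. */
	if (choice == CHOICE_NONE)
		return "";
	if (choice == CHOICE_GZIP)
		return ".gz";
	/* Fallback keyed on decompressor support, e.g. CONFIG_RD_GZIP=y. */
	if (rd_gzip)
		return ".gz";
	return "";	/* assumed last-resort default */
}
```

In this model, an unpatched configuration with no source and CONFIG_RD_GZIP=y
falls through to ".gz" even though "no compression" was requested, which is the
stray initramfs_data.cpio.gz target described in the message. With the
dependency added, the symbol simply has no value while the source is empty.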



[PATCH] KVM: X86: Fix preempt the preemption timer cancel

2017-05-20 Thread Wanpeng Li
From: Wanpeng Li 

 WARNING: CPU: 3 PID: 1952 at arch/x86/kvm/lapic.c:1529 kvm_lapic_expired_hv_timer+0xb5/0xd0 [kvm]
 CPU: 3 PID: 1952 Comm: qemu-system-x86 Not tainted 4.12.0-rc1+ #24
 RIP: 0010:kvm_lapic_expired_hv_timer+0xb5/0xd0 [kvm]
  Call Trace:
  handle_preemption_timer+0xe/0x20 [kvm_intel]
  vmx_handle_exit+0xc9/0x15f0 [kvm_intel]
  ? lock_acquire+0xdb/0x250
  ? lock_acquire+0xdb/0x250
  ? kvm_arch_vcpu_ioctl_run+0xdf3/0x1ce0 [kvm]
  kvm_arch_vcpu_ioctl_run+0xe55/0x1ce0 [kvm]
  kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
  ? kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
  ? __fget+0xf3/0x210
  do_vfs_ioctl+0xa4/0x700
  ? __fget+0x114/0x210
  SyS_ioctl+0x79/0x90
  do_syscall_64+0x8f/0x750
  ? trace_hardirqs_on_thunk+0x1a/0x1c
  entry_SYSCALL64_slow_path+0x25/0x25
 
This can be reproduced sporadically while booting L2 on a preemptible L1; the
splat appears on L1.

          CPU0                                   CPU1

  vmx_cancel_hv_timer
    vCPU0's vmx->hv_deadline_tsc = -1

                        preempt occur

                                         clear preemption timer field in CPU1's active vmcs
                                         vCPU0's apic_timer.hv_timer_in_use = false
  vmx_vcpu_run(vCPU0)
    vmx_arm_hv_timer
      if (vmx->hv_deadline_tsc == -1)
        nothing change

  handle_preemption_timer(vCPU0)
    kvm_lapic_expired_hv_timer
      WARN_ON(!apic->lapic_timer.hv_timer_in_use);
  
Preemption can occur while the preemption timer is being cancelled, which
leaves the lapic, vmx and vmcs state inconsistent. This patch fixes it by
disabling preemption while cancelling the preemption timer.

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/lapic.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index c329d28..6e6f345 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1495,8 +1495,10 @@ EXPORT_SYMBOL_GPL(kvm_lapic_hv_timer_in_use);
 
 static void cancel_hv_timer(struct kvm_lapic *apic)
 {
+   preempt_disable();
kvm_x86_ops->cancel_hv_timer(apic->vcpu);
apic->lapic_timer.hv_timer_in_use = false;
+   preempt_enable();
 }
 
 static bool start_hv_timer(struct kvm_lapic *apic)
-- 
2.7.4
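
To make the idea behind the fix concrete, here is a small single-threaded toy
model in C. It is not KVM code: struct toy_timer, toy_preempt_point() and the
other toy_* names are invented for this sketch, and the "scheduling point" is
just a function call that checks whether the two fields agree. It only shows
why two related updates need to be covered by one non-preemptible section.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the state touched by the cancel path. */
struct toy_timer {
	long hv_deadline_tsc;	/* -1 means "not armed", as in the message */
	bool hv_timer_in_use;
	bool preempt_disabled;
	int torn_observations;	/* times an observer saw mismatched state */
};

/* Stand-in for a scheduling point: only acts when preemption is allowed. */
static void toy_preempt_point(struct toy_timer *t)
{
	if (t->preempt_disabled)
		return;
	/* Consistent state: deadline is -1 exactly when the timer is unused. */
	if ((t->hv_deadline_tsc == -1) == t->hv_timer_in_use)
		t->torn_observations++;
}

/* Two-step cancel, optionally wrapped in a non-preemptible section. */
static void toy_cancel(struct toy_timer *t, bool disable_preempt)
{
	t->preempt_disabled = disable_preempt;
	t->hv_deadline_tsc = -1;
	toy_preempt_point(t);		/* window between the two updates */
	t->hv_timer_in_use = false;
	t->preempt_disabled = false;
	toy_preempt_point(t);
}

/* Returns how many times the half-updated state was observed. */
static int toy_run(bool disable_preempt)
{
	struct toy_timer t = { .hv_deadline_tsc = 1000, .hv_timer_in_use = true };

	toy_cancel(&t, disable_preempt);
	return t.torn_observations;
}
```

In this model toy_run(false) reports one observation of the half-updated state
and toy_run(true) reports none, which mirrors the effect of wrapping the two
statements of cancel_hv_timer() in preempt_disable()/preempt_enable().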



Re: [RFC] KVM: SVM: do not drop VMCB CPL to 0 if SS is not present

2017-05-20 Thread Andy Lutomirski

On 05/19/2017 09:14 AM, Roman Penyaev wrote:

Hi folks,

After experiencing guest double faults (sometimes triple faults) on
3.16 guest kernels with the following common pattern:

[459395.776124] PANIC: double fault, error_code: 0x0
[459395.776606] CPU: 0 PID: 36565 Comm: fio Not tainted 3.16.39kmemleak #4
[459395.776610] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.7.5-20140531_083029-gandalf 04/01/2014
[459395.776614] task: 880009ca06b0 ti: 88003cbc2000 task.ti:
88003cbc2000
[459395.776617] RIP: 0010:[]  []
__do_page_fault+0x1f/0x540
[459395.776628] RSP: 002b:7ffe0bc9bfa8  EFLAGS: 00010012
[459395.776631] RAX: 81539927 RBX:  RCX:
81539927
[459395.776634] RDX: 0028 RSI:  RDI:
7ffe0bc9c0a8
[459395.776637] RBP: 7ffe0bc9c0a8 R08: 0001a1d1002e9400 R09:
00063f1b
[459395.776640] R10: 33f8356c R11: 29c8250c3103 R12:
0028
[459395.776642] R13: 7ff8c83e R14:  R15:
7ffe0bc9c7c0
[459395.776649] FS:  7ff8d2aaa7c0() GS:88003f40()
knlGS:
[459395.776651] CS:  0010 DS:  ES:  CR0: 80050033
[459395.776656] CR2: 7ffe0bc9bf98 CR3: 3ca46000 CR4:
000407f0
[459395.776658] Stack:
[459395.776661]    

[459395.77]    

[459395.776670]    
[459395.776674] Call Trace:
[459395.776676]  
[459395.776678] Code:
[459395.776680] ad 8c 4e 00 be 04 00 03 00 eb a8 90 66 66 66 66 90 41
57 41 56 41 55 41 54 49 89 d4 55 53 48 89 fd 48 89 f3 48 81 ec c8 00
00 00 <65> 48 8b 04 25 28 00 00 00 48 89 84 24 c0 00 00 00 31 c0 65 48
[459395.776716] Kernel panic - not syncing: Machine halted.
[459395.777172] CPU: 0 PID: 36565 Comm: fio Not tainted 3.16.39kmemleak #4
[459395.777673] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.7.5-20140531_083029-gandalf 04/01/2014
[459395.778373]  0086 d85f6336 81532ec9
8170203e
[459395.779865]  88003f402f18 815318a1 7ff80008
88003f402f28
[459395.780061]  88003f402ec0 d85f6336 7fff0008
0046
[459395.780061] Call Trace:
[459395.780061]  <#DF>  [] ? dump_stack+0x47/0x5a
[459395.780061]  [] ? panic+0xcf/0x206
[459395.780061]  [] ? df_debug+0x2d/0x30
[459395.780061]  [] ? do_double_fault+0x78/0xf0
[459395.780061]  [] ? double_fault+0x22/0x30
[459395.780061]  [] ? native_iret+0x7/0x7
[459395.780061]  [] ? __do_page_fault+0x1f/0x540

we found that all kernel backtraces have a userspace RSP, and that the
userspace memory there holds the trail of a normal timer, page fault or
virtio interrupt:

(the following RSP pointer does not belong to this particular crash
  above, but it does not matter, symptoms are always the same)

crash> rd -s 7f6cb9556768 100
 7f6cb9556768:  7f6cfaa21270 7f6cfaa21270
 7f6cb9556778:   7f6cf9b8c6a0
 7f6cb9556788:  7f6cf983399a 
 7f6cb9556798:   7f6cf98a1f2d
 7f6cb95567a8:  7f6cfaa21270 
 7f6cb95567b8:  7f6ca4031880
 ff7e  IRQ,
~0xff7e = 0x81
 7f6cb95567c8:  7f6cfa817ae1 0033  RIP; CS
 7f6cb95567d8:  0202 7f6cb95567f0  EFLAGS; RSP
 7f6cb95567e8:  002b   SS
 7f6cb00318e0
 7f6cb95567f8:  7f6cfa817af5 7f6cac0318e0
 7f6cb9556808:  7f6cfa817af5 7f6cb4031880
 7f6cb9556818:  7f6cfa817af5 7f6cfbe82340
 7f6cb9556828:  7f6cfa817af5 7f6c940318c0
 7f6cb9556838:  7f6cfa817af5 7f6ca00318c0
 7f6cb9556848:  7f6cfa817af5 7f6c9c031920
 7f6cb9556858:  7f6cfa817af5 7f6ca8031920
 7f6cb9556868:  7f6cfa817af5 7f6ca4039df0
 7f6cb9556878:  7f6cfa817af5 7f6cb0039e50
 7f6cb9556888:  7f6cfa817af5 7f6cac039e50

It turned out that the CPU does not switch SS:RSP and takes the interrupt
on the userspace stack (BTW init_tss and gdb_page are not corrupted).
That is completely weird.

The next step was to trace the VMCB before and after VMRUN to understand the
exact state seen by the real CPU.  The VMCB was traced when the wrong CPU
state was observed: RIP points into the kernel and RSP comes from userspace.
The following is a diff of the VMCB, where
--- is the state before VMRUN and
+++ is the state after VMRUN:

  -  event_inj = 0x8081,
  +  event_inj = 0x0,

 ...

 cs = {
  -selector = 0x33,
  -attrib = 0xafb,
  +selector = 0x10,
  +attrib = 0x29b,
   limit = 0x,
   base = 0x0
 },
 ss = {
   selector = 0x2b,
   attrib = 0x0,
   limit = 0x,
   

Re: [PATCH] usb: gadget: f_fs: use memdup_user

2017-05-20 Thread Al Viro
On Sat, May 13, 2017 at 11:05:30AM +0300, Dan Carpenter wrote:

> > +   data = memdup_user(buf, len);
> > +   if (unlikely(IS_ERR(data)))
> 
> Don't use likely/unlikely() here.  It's not a fast path.

More to the point,

#define IS_ERR_VALUE(x) unlikely((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO)
static inline bool __must_check IS_ERR(__force const void *ptr)
{
	return IS_ERR_VALUE((unsigned long)ptr);
}

IOW, IS_ERR() already produces unlikely(), fast path or not.
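
For anyone who wants to poke at this outside the kernel, a rough userspace
approximation follows.  Every sketch_* name is invented for this example, the
value 4095 mirrors the kernel's MAX_ERRNO, and __builtin_expect() is the
GCC/Clang builtin that the kernel's unlikely() expands to.  It is only a
sketch of the idea, not the kernel header.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095
#define sketch_unlikely(x) __builtin_expect(!!(x), 0)

/* Encode a negative errno value in a pointer. */
static inline void *sketch_err_ptr(long error)
{
	return (void *)error;
}

/* Decode the errno value carried by an error pointer. */
static inline long sketch_ptr_err(const void *ptr)
{
	return (long)ptr;
}

/* The branch hint lives inside the helper, as in IS_ERR_VALUE(). */
static inline bool sketch_is_err(const void *ptr)
{
	return sketch_unlikely((uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO);
}

/* Caller side: a plain test is enough, no outer unlikely() is needed. */
static inline int sketch_check(const void *data)
{
	if (sketch_is_err(data))
		return (int)sketch_ptr_err(data);
	return 0;
}
```

The point for the patch is the caller side: "if (IS_ERR(data))" already carries
the hint, so wrapping it in another unlikely() adds nothing.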


Re: [PATCH v3 08/10] x86/hyper-v: use hypercall for remote TLB flush

2017-05-20 Thread Andy Lutomirski

On 05/19/2017 07:09 AM, Vitaly Kuznetsov wrote:

A Hyper-V host can suggest that we use a hypercall for remote TLB flushes;
this is supposed to work faster than IPIs.

Implementation details: to do HvFlushVirtualAddress{Space,List} hypercalls
we need to put the input somewhere in memory, and we don't really want to
allocate memory on each call, so we pre-allocate per-cpu memory areas
on boot. These areas are of fixed size; limit them to an arbitrary number
of 16 (16 gvas are able to specify 16 * 4096 pages).

pv_ops patching is happening very early so we need to separate
hyperv_setup_mmu_ops() and hyper_alloc_mmu().

It is possible and easy to implement local TLB flushing too, and there is
even a hint for that. However, I don't see room for optimization on the
host side as both hypercall and native tlb flush will result in vmexit. The
hint is also not set on modern Hyper-V versions.
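
The "16 gvas are able to specify 16 * 4096 pages" arithmetic from the
description above can be sketched as follows.  This is an illustration only:
the sketch_* names are made up, and the entry layout (page-aligned address in
the upper bits, number of additional pages in the low 12 bits) is an
assumption based on how the Hyper-V TLFS describes GVA ranges, not code taken
from the patch.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT    12
#define SKETCH_PAGE_SIZE     (1ULL << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK     (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_MAX_GVAS      16ULL	/* the arbitrary limit from the patch */
#define SKETCH_PAGES_PER_GVA 4096ULL	/* 12 low bits: 0..4095 extra pages */

/* Pack one list entry: page-aligned start address in the upper bits,
 * (pages - 1) additional pages in the low 12 bits; pages is 1..4096. */
static inline uint64_t sketch_gva_entry(uint64_t start, uint64_t pages)
{
	return (start & SKETCH_PAGE_MASK) |
	       ((pages - 1) & (SKETCH_PAGES_PER_GVA - 1));
}

/* Upper bound on what one pre-allocated area can describe, in pages. */
static inline uint64_t sketch_max_pages(void)
{
	return SKETCH_MAX_GVAS * SKETCH_PAGES_PER_GVA;
}
```

With 4 KiB pages, 16 * 4096 pages is 65536 pages, i.e. 256 MiB of address
space per hypercall; anything larger presumably has to fall back to flushing
the whole address space.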


Why do local flushes exit?


+static void hyperv_flush_tlb_others(const struct cpumask *cpus,
+   struct mm_struct *mm, unsigned long start,
+   unsigned long end)
+{


What tree will this go through?  I'm about to send a signature change 
for this function for tip:x86/mm.


Also, how would this interact with PCID?  I have PCID patches that I'm 
pretty happy with now, and I'm hoping to support PCID in 4.13.



[GIT PULL] tracing: Fixes for 4.12-rc1

2017-05-20 Thread Steven Rostedt

Linus,

This fixes a bug caused by not cleaning up the new instance unique triggers
when deleting an instance. It also creates a selftest that triggers that bug.

Fix the delayed kprobes optimization running after the kprobes boot-up self
tests have been removed by the freeing of init memory.

Comment kprobes on why the delayed optimization is not a problem for the
removal of modules, to keep other developers from puzzling over that riddle.

Fix another "RCU isn't watching" case in stack trace tracing.

Naveen N. Rao (4):
  ftrace: Simplify glob handling in unregister_ftrace_function_probe_func()
  ftrace/instances: Clear function triggers when removing instances
  selftests/ftrace: Fix bashisms
  selftests/ftrace: Add test to remove instance with active event triggers

Steven Rostedt (1):
  tracing: Move postpone selftests to core from early_initcall

Steven Rostedt (VMware) (3):
  ftrace: Remove #ifdef from code and add clear_ftrace_function_probes() stub
  kprobes: Document how optimized kprobes are removed from module unload
  tracing: Make sure RCU is watching before calling a stack trace

Thomas Gleixner (1):
  tracing/kprobes: Enforce kprobes teardown after testing


Please pull the latest trace-v4.12-rc1 tree, which can be found at:


  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
trace-v4.12-rc1

Tag SHA1: 7edb887bbfb5c2b3556dca9274b68be9bb924c3e
Head SHA1: a33d7d94eed92b23fbbc7b0de06a41b2bbaa49e3


Naveen N. Rao (4):
  ftrace: Simplify glob handling in unregister_ftrace_function_probe_func()
  ftrace/instances: Clear function triggers when removing instances
  selftests/ftrace: Fix bashisms
  selftests/ftrace: Add test to remove instance with active event triggers

Steven Rostedt (1):
  tracing: Move postpone selftests to core from early_initcall

Steven Rostedt (VMware) (3):
  ftrace: Remove #ifdef from code and add clear_ftrace_function_probes() stub
  kprobes: Document how optimized kprobes are removed from module unload
  tracing: Make sure RCU is watching before calling a stack trace

Thomas Gleixner (1):
  tracing/kprobes: Enforce kprobes teardown after testing


 include/linux/kprobes.h                            |  3 ++
 kernel/kprobes.c                                   |  8 -
 kernel/trace/ftrace.c                              | 12 ++--
 kernel/trace/trace.c                               | 34 --
 kernel/trace/trace.h                               |  5 
 kernel/trace/trace_kprobe.c                        |  5 
 tools/testing/selftests/ftrace/ftracetest          |  2 +-
 .../ftrace/test.d/ftrace/func_event_triggers.tc    |  2 +-
 tools/testing/selftests/ftrace/test.d/functions    |  4 +--
 .../ftrace/test.d/instances/instance-event.tc      |  8 +++--
 10 files changed, 72 insertions(+), 11 deletions(-)
---
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 30f90c1a0aaf..541df0b5b815 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -349,6 +349,9 @@ extern int proc_kprobes_optimization_handler(struct ctl_table *table,
 int write, void __user *buffer,
 size_t *length, loff_t *ppos);
 #endif
+extern void wait_for_kprobe_optimizer(void);
+#else
+static inline void wait_for_kprobe_optimizer(void) { }
 #endif /* CONFIG_OPTPROBES */
 #ifdef CONFIG_KPROBES_ON_FTRACE
 extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 7367e0ec6f81..2d2d3a568e4e 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -595,7 +595,7 @@ static void kprobe_optimizer(struct work_struct *work)
 }
 
 /* Wait for completing optimization and unoptimization */
-static void wait_for_kprobe_optimizer(void)
+void wait_for_kprobe_optimizer(void)
 {
 	mutex_lock(&kprobe_mutex);
 
@@ -2183,6 +2183,12 @@ static int kprobes_module_callback(struct notifier_block *nb,
 * The vaddr this probe is installed will soon
 * be vfreed buy not synced to disk. Hence,
 * disarming the breakpoint isn't needed.
+*
+* Note, this will also move any optimized probes
+* that are pending to be removed from their
+* corresponding lists to the freeing_list and
+* will not be touched by the delayed
+* kprobe_optimizer work handler.
 */
kill_kprobe(p);
}
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 39dca4e86a94..74fdfe9ed3db 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -4144,9 +4144,9 @@ 

Re: [PATCH v3 04/10] x86/hyper-v: fast hypercall implementation

2017-05-20 Thread Andy Lutomirski

On 05/19/2017 07:09 AM, Vitaly Kuznetsov wrote:

Hyper-V supports 'fast' hypercalls when all parameters are passed through
registers. Implement an inline version of the simplest of these calls:
a hypercall with one 8-byte input and no output.

Proper hypercall input interface (struct hv_hypercall_input) definition is
added as well.
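
As a rough illustration of what the control word for such a fast call looks
like, here is a small sketch.  It does not use the union from the patch; the
sketch_* names are invented, and the bit positions (call code in bits 15:0,
"fast" flag in bit 16) are taken from my reading of the Hyper-V TLFS, so treat
them as an assumption.  The two helpers at the end only show the hi/lo split
that a 32-bit build needs to pass a 64-bit value in a register pair.

```c
#include <stdint.h>

#define SKETCH_HV_FAST_BIT (1ULL << 16)	/* assumed position of "fast" */

/* Build a control word for a fast hypercall: call code in the low
 * 16 bits plus the fast flag, everything else left at zero. */
static inline uint64_t sketch_hv_fast_control(uint16_t code)
{
	return (uint64_t)code | SKETCH_HV_FAST_BIT;
}

/* Split a 64-bit value into the halves a 32-bit build would pass. */
static inline uint32_t sketch_lo32(uint64_t v)
{
	return (uint32_t)v;
}

static inline uint32_t sketch_hi32(uint64_t v)
{
	return (uint32_t)(v >> 32);
}
```

For example, call code 0x0002 with the fast flag set gives a control word of
0x10002.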

Signed-off-by: Vitaly Kuznetsov 
Acked-by: K. Y. Srinivasan 
Tested-by: Simon Xiao 
Tested-by: Srikanth Myakam 
---
  arch/x86/include/asm/mshyperv.h| 39 ++
  arch/x86/include/uapi/asm/hyperv.h | 19 +++
  2 files changed, 58 insertions(+)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index e293937..028e29b 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -216,6 +216,45 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
  #endif /* !x86_64 */
  }
  
+/* Fast hypercall with 8 bytes of input and no output */

+static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
+{
+   union hv_hypercall_input control = {0};
+
+   control.code = code;
+   control.fast = 1;
+#ifdef CONFIG_X86_64
+   {
+   u64 hv_status;
+
+   __asm__ __volatile__("call *%3"
+: "=a" (hv_status),
+  "+c" (control.as_uint64), "+d" (input1)
+: "m" (hv_hypercall_pg)
+: "cc", "r8", "r9", "r10", "r11");
+   return hv_status;
+   }
+#else
+   {
+   u32 hv_status_hi, hv_status_lo;
+   u32 input1_hi = (u32)(input1 >> 32);
+   u32 input1_lo = (u32)input1;
+
+   __asm__ __volatile__ ("call *%6"
+ : "=d"(hv_status_hi),
+   "=a"(hv_status_lo),
+   "+c"(input1_lo)
+ : "d" (control.as_uint32_hi),
+   "a" (control.as_uint32_lo),
+   "b" (input1_hi),
+   "m" (hv_hypercall_pg)
+ : "cc", "edi", "esi");
+
+   return hv_status_lo | ((u64)hv_status_hi << 32);
+   }
+#endif


This is going to need an explicit "sp" annotation to force a stack 
frame, I think.  Otherwise objtool is likely to get mad in a 
frame-pointer-omitted build.


Re: [PATCH] ext4: keep existing extra fields when inode expands

2017-05-20 Thread Theodore Ts'o
On Fri, May 19, 2017 at 10:13:39AM +0300, Konstantin Khlebnikov wrote:
> ext4_expand_extra_isize() should clear only space between old and new size.
> 
> Signed-off-by: Konstantin Khlebnikov 

Thanks, applied.

- Ted


Re: [v3, 2/2] initramfs: Allow again choice of the embedded initram compression algorithm

2017-05-20 Thread Florian Fainelli


On 05/20/2017 03:30 PM, Florian Fainelli wrote:
> Hi Francisco, Nicholas,
> 
> Nicholas already fixed part of this commit, but there is more breakage,
> see below:
> 
> On 09/27/2016 01:32 PM, klondike wrote:
>> Choosing the appropriate compression option when using an embedded initramfs
>> can result in significant size differences in the resulting data.
>>
>> This is caused by avoiding double compression of the initramfs contents.
>> For example, in my tests, choosing CONFIG_INITRAMFS_COMPRESSION_NONE (when
>> compressing the kernel using XZ) results in up to 500KiB differences (9MiB to
>>  8.5MiB) in the kernel size as the dictionary will not get polluted with
>> incompressible data and may reuse kernel data too.
>>
>> Despite embedding an uncompressed initramfs, a user may want to allow for a
>> compressed extra initramfs to be passed using the rd system, for example to
>> boot a recovery system. Commit 9ba4bcb645898d562498ea66a0df958ef0e7a68c
>> ("initramfs: read CONFIG_RD_ variables for initramfs compression") broke
>> that behavior by making the choice based on CONFIG_RD_* instead of adding
>> CONFIG_INITRAMFS_COMPRESSION_LZ4. Sadly, CONFIG_RD_* is also used to
>> choose the supported RD compression algorithms by the kernel and a user may
>> want to support more than one.
>>
>> This patch also reverses 3e4e0f0a8756dade3023d1f47d50fbced7749788
>> ("initramfs: remove "compression mode" choice") restoring back the
>> "compression mode" choice and includes the CONFIG_INITRAMFS_COMPRESSION_LZ4
>> option which was never added.
>>
>> As a result the following options are added or re-added, affecting the
>> embedded initramfs compression:
>> INITRAMFS_COMPRESSION_NONE Do no compression
>> INITRAMFS_COMPRESSION_GZIP Compress using gzip
>> INITRAMFS_COMPRESSION_BZIP2 Compress using bzip2
>> INITRAMFS_COMPRESSION_LZMA Compress using lzma
>> INITRAMFS_COMPRESSION_XZ Compress using xz
>> INITRAMFS_COMPRESSION_LZO Compress using lzo
>> INITRAMFS_COMPRESSION_LZ4 Compress using lz4
>>
>> These depend on the corresponding CONFIG_RD_* option being set (except NONE
>> which has no dependencies).
>>
>> This patch depends on the previous one (the previous version didn't) to
>> simplify the way in which the algorithm is chosen and keep backwards
>> compatibility with the behaviour introduced by commit
>>  9ba4bcb645898d562498ea66a0df958ef0e7a68c
>>
>> Signed-off-by: Francisco Blas Izquierdo Riera (klondike) 
>> 
>> Cc: P J P 
>> Cc: Paul Bolle 
>> Cc: Andrew Morton 
> 
> Running a bisection against usr/ was not particularly convincing, but
> here is basically what I am observing. This used to work just fine as of
> v4.9, and I have since tracked the breakage to this particular
> commit/patch. Here is what my build system does:
> 
> - kernel is initially configured not to have an initramfs included
> - build the user space root file system
> - re-configure the kernel to have an initramfs included
> (CONFIG_INITRAMFS_SOURCE="/path/to/romfs") and set relevant
> CONFIG_INITRAMFS options, in my case, no compression
> - kernel is re-built with these options -> kernel+initramfs image is copied
> - kernel is re-built again without these options -> kernel image is copied
> 
> Now suppose you make changes to your root filesystem, like adding/removing
> applications: initramfs_data.cpio is now a stale file, and you go through the
> same process again:
> 
> - build the kernel without an initramfs
> - user space (re)build
> - build the kernel with an initramfs
> 
> Building a kernel without an initramfs means setting this option:
> 
> CONFIG_INITRAMFS_SOURCE=""
> 
> whereas building a kernel with an initramfs means setting these options:
> 
> CONFIG_INITRAMFS_SOURCE="/home/fainelli/work/uclinux-rootfs/romfs
> /home/fainelli/work/uclinux-rootfs/misc/initramfs.dev"
> CONFIG_INITRAMFS_ROOT_UID=1000
> CONFIG_INITRAMFS_ROOT_GID=1000
> CONFIG_INITRAMFS_COMPRESSION_NONE=y
> # CONFIG_INITRAMFS_COMPRESSION_GZIP is not set
> # CONFIG_INITRAMFS_COMPRESSION_BZIP2 is not set
> # CONFIG_INITRAMFS_COMPRESSION_LZMA is not set
> # CONFIG_INITRAMFS_COMPRESSION_XZ is not set
> # CONFIG_INITRAMFS_COMPRESSION_LZO is not set
> # CONFIG_INITRAMFS_COMPRESSION_LZ4 is not set
> CONFIG_INITRAMFS_COMPRESSION=""
> 
> Commit db2aa7fd15e857891cefbada8348c8d938c7a2bc ("initramfs: allow again
> choice of the embedded initram compression algorithm") appears
> problematic because CONFIG_INITRAMFS_COMPRESSION which is used to
> determine the initramfs cpio extension/compression is a string, and due
> to how Kconfig works, it will evaluate, in order, how to assign it.
> Setting CONFIG_INITRAMFS_COMPRESSION_NONE with
> CONFIG_INITRAMFS_SOURCE="" cannot possibly work (because of the depends
> on INITRAMFS_SOURCE!="") yet we still manage to get it assigned to
> something: ".gz" because CONFIG_RD_GZIP=y is set in my kernel, even when
> there is no initramfs being built.
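
The behaviour described in the paragraph above can be modelled with a few
lines of C.  This is a simplified model and not the real Kconfig logic: the
sketch_* names are made up, only two of the defaults are represented, and
their order is an assumption chosen to match the description (the explicit
choice is consulted first, then the CONFIG_RD_* fallbacks).

```c
#include <stdbool.h>
#include <stddef.h>

/* One "default <value> if <cond>" line of a string option. */
struct sketch_default {
	const char *value;
	bool cond;
};

/* First default whose condition holds wins, as with Kconfig defaults. */
static const char *sketch_pick_default(const struct sketch_default *d,
				       size_t n, const char *fallback)
{
	for (size_t i = 0; i < n; i++)
		if (d[i].cond)
			return d[i].value;
	return fallback;
}

/* source_set: INITRAMFS_SOURCE != ""; want_none: the user asked for
 * INITRAMFS_COMPRESSION_NONE; rd_gzip: CONFIG_RD_GZIP=y. */
static const char *sketch_initramfs_compression(bool source_set,
						bool want_none, bool rd_gzip)
{
	/* The choice depends on INITRAMFS_SOURCE != "", so without a
	 * source the NONE entry cannot be set, whatever was requested. */
	bool comp_none = source_set && want_none;
	const struct sketch_default defaults[] = {
		{ "",    comp_none },
		{ ".gz", rd_gzip   },
	};

	return sketch_pick_default(defaults, 2, "");
}
```

With no initramfs source and CONFIG_RD_GZIP=y the model yields ".gz" even
though no compression was requested, which is the mismatch described above;
with a source set and NONE selected it yields the empty suffix.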
> 
> So we basically end-up generating two initramfs_data.cpio* files, one
> without extension, and one with .gz. This causes 

Re: [PATCH] ext4: handle the rest of ext4_mb_load_buddy() ENOMEM errors

2017-05-20 Thread Theodore Ts'o
On Fri, May 19, 2017 at 10:09:54AM +0300, Konstantin Khlebnikov wrote:
> I've got another report about breaking ext4 by ENOMEM error returned from
> ext4_mb_load_buddy() caused by memory shortage in memory cgroup.
> This time inside ext4_discard_preallocations().
> 
> This patch replaces ext4_error() with ext4_warning() where errors returned
> from ext4_mb_load_buddy() are not fatal and handled by caller:
> * ext4_mb_discard_group_preallocations() - called before generating ENOSPC,
>   we'll try to discard other group or return ENOSPC into user-space.
> * ext4_trim_all_free() - just stop trimming and return ENOMEM from ioctl.
> 
> Some callers cannot handle errors, thus __GFP_NOFAIL is used for them:
> * ext4_discard_preallocations()
> * ext4_mb_discard_lg_preallocations()
> 
> The only unclear case is ext4_group_add_blocks(); probably ext4_std_error()
> should handle ENOMEM as a warning and not break the filesystem.
> 
> Fixes: adb7ef600cc9 ("ext4: use __GFP_NOFAIL in ext4_free_blocks()")
> Signed-off-by: Konstantin Khlebnikov 

Thanks, applied.

- Ted


Re: [PATCH 0/3] arm64: dts: introduce QorIQ DPAA 1.x FMan device tree nodes

2017-05-20 Thread Shawn Guo
On Tue, May 16, 2017 at 03:07:20PM +0300, Madalin Bucur wrote:
> This patch set introduces the QorIQ Data Path Acceleration Architecture
> (DPAA) Frame Manager device tree nodes for the ARM based DPAA 1.x platforms.
> 
> Madalin Bucur (3):
>   arm64: dts: add DPAA FMan nodes
>   arm64: dts: add LS1043A DPAA FMan support
>   arm64: dts: add LS1046A DPAA FMan nodes

Applied all, thanks.


Re: [v4 1/1] mm: Adaptive hash table scaling

2017-05-20 Thread Andi Kleen
Pavel Tatashin  writes:

> Allow hash tables to scale with memory but at slower pace, when HASH_ADAPT
> is provided every time memory quadruples the sizes of hash tables will only
> double instead of quadrupling as well. This algorithm starts working only
> when memory size reaches a certain point, currently set to 64G.
>
> This is example of dentry hash table size, before and after four various
> memory configurations:

IMHO the scale is still too aggressive. I find it very unlikely
that a 1TB machine really needs 256MB of hash table, because the
number of files in use is unlikely to scale directly with memory.

Perhaps we should just cap it at some large size, e.g. 32M.

-Andi
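
The scaling rule quoted above (table size doubles each time memory
quadruples, starting at 64G) can be sketched in user space as follows.
This is only an illustration under stated assumptions: the helper names
adapt_extra_scale() and adapt_table_bytes() are hypothetical and are not
taken from the patch, and the table size is modelled as a linearly scaled
size shifted right by the number of extra steps.

```c
#include <stdint.h>

#define ADAPT_BASE_BYTES (UINT64_C(64) << 30) /* 64G threshold from the description */
#define ADAPT_SHIFT      2                    /* one step per quadrupling of memory */

/* Hypothetical helper: number of extra halvings applied above the threshold. */
unsigned int adapt_extra_scale(uint64_t mem_bytes)
{
    unsigned int extra = 0;
    uint64_t step = ADAPT_BASE_BYTES;

    while (step < mem_bytes) {
        extra++;
        if (step > (UINT64_MAX >> ADAPT_SHIFT))
            break; /* the next shift would overflow */
        step <<= ADAPT_SHIFT;
    }
    return extra;
}

/* Hypothetical helper: adapted size, given the size plain linear scaling would pick. */
uint64_t adapt_table_bytes(uint64_t linear_bytes, uint64_t mem_bytes)
{
    return linear_bytes >> adapt_extra_scale(mem_bytes);
}
```

With this model, going from 64G to 256G quadruples the linear size but
halves it once, so the table only doubles. A fixed cap, as suggested in
the reply, would be an additional min() on the result.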


Re: [PATCH] ARM: dts: imx7: use 3 PWM cells

2017-05-20 Thread Shawn Guo
On Tue, May 16, 2017 at 12:40:13AM -0700, Stefan Agner wrote:
> The PWM driver now has the capability to specify the PWM polarity,
> which is useful e.g. for backlight control. Allow making use of PWM
> polarity by specifying pwm-cells to be 3 in the base dt.
> 
> Signed-off-by: Stefan Agner 

Applied, thanks.


Re: [PATCH v3 0/7] i.MX7 PCIe related device tree changes

2017-05-20 Thread Shawn Guo
On Mon, May 15, 2017 at 07:52:58AM -0700, Andrey Smirnov wrote:
> Andrey Smirnov (7):
>   ARM: dts: imx: Reintroduce 'anatop-enable-bit' where appropriate
>   ARM: imx: Select GPCv2 for i.MX7
>   ARM: dts: imx7s: Add node for GPC
>   ARM: dts: imx7s: Mark 'gpr' compatible with i.MX6 variant
>   ARM: dts: imx7d-sdb: Add GPIO expander node
>   ARM: dts: imx7d: Add node for PCIe controller
>   ARM: dts: imx7d-sdb: Enable PCIe peripheral

Applied all, thanks.


[PATCH v9] mm: Add memory allocation watchdog kernel thread.

2017-05-20 Thread Tetsuo Handa
This patch adds a watchdog which periodically reports the number of
memory-allocating tasks, dying tasks and OOM victim tasks when some task
is spending too long inside __alloc_pages_slowpath(). This patch also
serves as a hook for obtaining additional information using SystemTap
(e.g. examining other variables using printk(), capturing a crash dump by
calling panic()) by triggering a callback only when a stall is detected.
The ability to take administrator-controlled actions based on some
threshold is a big advantage gained by introducing state tracking.

Commit 63f53dea0c9866e9 ("mm: warn about allocations which stall for
too long") was a great step toward reducing the possibility of silent
hang-ups caused by memory allocation stalls [1]. However, there are
reports of long stalls (e.g. [2] is over 30 minutes!) and lockups (e.g.
[3] is an "unable to invoke the OOM killer due to !__GFP_FS allocation"
lockup problem) where this patch is more useful than that commit, because
this patch can report possibly related tasks even if allocating tasks
are unexpectedly blocked for a long time. For premature OOM killer
invocation, tracepoints which can accumulate samples at short intervals
would be useful. But for allocation stalls that are reported too late,
this patch, which can capture all tasks (for reporting the overall
situation) at longer intervals and act as a trigger (for accumulating
short-interval samples), would be useful.

Thanks to the OOM reaper, which can guarantee forward progress (by selecting
the next OOM victim) as long as the OOM killer can be invoked, we can start
testing low memory situations which were previously too difficult to test.
And we are now aware that there are still corner cases remaining where
the system hangs without invoking the OOM killer.

This patch is aimed at helping determine whether unexpected hangs are
related to memory allocation. By merging this patch (and enabling this
watchdog in enterprise systems via kernels supported by distributors),
we can identify patterns/cases of problems (if related to memory
allocation) and improve the quality of Linux kernels by fixing problems
related to memory allocation.

By the nature of hang-up problems caused by memory allocation, it is very
hard for administrators to collect information for analysis. As a result,
such problems are left unrecognized/unsolved at the support center, and
are seldom reported to distributors/developers in order to ask for fixes.
Therefore, nobody can prove that this patch will not find any problems
which occur in production systems. By merging this patch, we can start
focusing on real problems which occur in production systems.

This patch remained out-of-tree for a year and a half due to the question
of whether the amount of changes, runtime cost and maintenance burden
caused by this patch can be justified. But in the end there was no real
objection.

  Regarding amount of changes, I consider it is needed for making the
  watchdog safe/robust (e.g. no duplicated/skipped reports and no lockup
  warnings even if hundreds of threads entered into direct reclaim for
  memory allocation) and useful (e.g. trigger additional actions only when
  needed).

  Regarding runtime cost of allocating threads, this watchdog involves
  only the slowpath where __GFP_DIRECT_RECLAIM is evaluated (in other words,
  direct reclaim for memory allocation is needed). Therefore, systems
  that are not under significant memory pressure will not notice.
  Regarding runtime cost of the watchdog kernel thread side, I tried to
  minimize it by checking per-CPU in-flight counters before traversing
  the tasklist.

  Regarding maintenance burden, I consider this patch is least invasive
  because it does not make __GFP_NOWARN flag's semantic confusing while
  providing administrators some hints [4]. Also, this patch will remain
  useful because we might overlook something that can cause infinite
  loop (or significant delay) in future changes, and we can remove this
  patch when we achieve safe and robust memory management subsystem.
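
The idea of checking per-CPU in-flight counters before traversing the
tasklist can be sketched in user space roughly as below. This is not the
patch's code: struct alloc_track, alloc_enter(), alloc_leave(),
count_stalled() and SKETCH_NR_CPUS are hypothetical names, time is a plain
counter, and no locking is shown.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4 /* hypothetical CPU count for the sketch */

/* Per-task state: is an allocation in flight, and when did it start? */
struct alloc_track {
    bool in_flight;
    unsigned long start;
};

/* One counter per CPU; only the sum over all CPUs is meaningful. */
static unsigned int inflight[SKETCH_NR_CPUS];

void alloc_enter(struct alloc_track *t, unsigned int cpu, unsigned long now)
{
    t->start = now;
    t->in_flight = true;
    inflight[cpu % SKETCH_NR_CPUS]++;
}

/* A task may leave on a different CPU; unsigned wraparound keeps the sum right. */
void alloc_leave(struct alloc_track *t, unsigned int cpu)
{
    t->in_flight = false;
    inflight[cpu % SKETCH_NR_CPUS]--;
}

/* Watchdog side: skip the task scan entirely when nothing is in flight. */
size_t count_stalled(const struct alloc_track *tasks, size_t n,
                     unsigned long now, unsigned long timeout)
{
    unsigned int total = 0;
    size_t i, stalled = 0;

    for (i = 0; i < SKETCH_NR_CPUS; i++)
        total += inflight[i];
    if (total == 0)
        return 0; /* cheap path: no task is inside the slowpath */

    for (i = 0; i < n; i++)
        if (tasks[i].in_flight && now - tasks[i].start >= timeout)
            stalled++;
    return stalled;
}
```

The point of the design is that the common case (no allocation in the
slowpath) costs only a handful of counter reads.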

Changes from v1 [5]:

  (1) Use per "struct task_struct" variables. This allows vmcore to
  remember information about last memory allocation request, which
  is useful for understanding last-minute behavior of the kernel.

  (2) Report using accurate timeout. This increases possibility of
  successfully reporting before watchdog timers reset the machine.

  (3) Show memory information (SysRq-m). This makes it easier to know
  the reason of stalling.

  (4) Show both $state_of_allocation and $state_of_task in the same
  line. This makes it easier to grep the output.

  (5) Minimize duration of spinlock held by the kernel thread.

Changes from v2 [6]:

  (1) Print sequence number. This makes it easier to know whether
  memory allocation is succeeding (looks like a livelock but making
  forward progress) or not.

  (2) Replace spinlock with cheaper seqlock_t like sequence number based
  method. The caller no longer contend on lock, 

[PATCH -next] drm/vgem: Fix return value check in vgem_init()

2017-05-20 Thread Wei Yongjun
From: Wei Yongjun 

In case of error, the function platform_device_register_simple() returns
ERR_PTR() and never returns NULL. The NULL test in the return value
check should be replaced with IS_ERR().

Fixes: 315f0242aa2b ("drm/vgem: Convert to a struct drm_device subclass")
Signed-off-by: Wei Yongjun 
---
 drivers/gpu/drm/vgem/vgem_drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index 54ec94c..18f401b 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -438,8 +438,8 @@ static int __init vgem_init(void)
 
vgem_device->platform =
platform_device_register_simple("vgem", -1, NULL, 0);
-   if (!vgem_device->platform) {
-   ret = -ENODEV;
+   if (IS_ERR(vgem_device->platform)) {
+   ret = PTR_ERR(vgem_device->platform);
goto out_fini;
}
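
Why the NULL test misses the failure can be shown with a small user-space
model of the error-pointer convention. The three helpers below are
simplified stand-ins for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(), and
fake_register(), check_null() and check_is_err() are hypothetical names
used only for this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified user-space stand-ins for the kernel helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* Error pointers occupy the top MAX_ERRNO addresses. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical registration call that reports failure as an error pointer. */
void *fake_register(int fail)
{
    static int device;
    return fail ? ERR_PTR(-ENODEV) : (void *)&device;
}

/* Old-style check: only NULL counts as failure, so an error pointer passes. */
int check_null(void *p) { return p ? 0 : -ENODEV; }

/* Fixed check: detect the error pointer and propagate its errno. */
int check_is_err(void *p) { return IS_ERR(p) ? (int)PTR_ERR(p) : 0; }
```

Here check_null(fake_register(1)) reports success even though the call
failed, while check_is_err() returns -ENODEV.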



[PATCH -next] drm/pl111: Fix return value check in pl111_amba_probe()

2017-05-20 Thread Wei Yongjun
From: Wei Yongjun 

In case of error, the function devm_ioremap_resource() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().

Fixes: bed41005e617 ("drm/pl111: Initial drm/kms driver for pl111")
Signed-off-by: Wei Yongjun 
---
 drivers/gpu/drm/pl111/pl111_drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/pl111/pl111_drv.c b/drivers/gpu/drm/pl111/pl111_drv.c
index 936403f..c6b93ff 100644
--- a/drivers/gpu/drm/pl111/pl111_drv.c
+++ b/drivers/gpu/drm/pl111/pl111_drv.c
@@ -203,9 +203,9 @@ static int pl111_amba_probe(struct amba_device *amba_dev,
}
 
priv->regs = devm_ioremap_resource(dev, &amba_dev->res);
-   if (!priv->regs) {
+   if (IS_ERR(priv->regs)) {
dev_err(dev, "%s failed mmio\n", __func__);
-   return -EINVAL;
+   return PTR_ERR(priv->regs);
}
 
/* turn off interrupts before requesting the irq */



[PATCH] goldfish_pipe: use GFP_ATOMIC under spin lock

2017-05-20 Thread Wei Yongjun
From: Wei Yongjun 

The function get_free_pipe_id_locked() is called from
goldfish_pipe_open() with a lock held, so we should
use GFP_ATOMIC instead of GFP_KERNEL.

Signed-off-by: Wei Yongjun 
---
 drivers/platform/goldfish/goldfish_pipe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/platform/goldfish/goldfish_pipe.c b/drivers/platform/goldfish/goldfish_pipe.c
index 2de1e60..5f36721 100644
--- a/drivers/platform/goldfish/goldfish_pipe.c
+++ b/drivers/platform/goldfish/goldfish_pipe.c
@@ -704,7 +704,7 @@ static int get_free_pipe_id_locked(struct goldfish_pipe_dev *dev)
/* Reallocate the array */
u32 new_capacity = 2 * dev->pipes_capacity;
struct goldfish_pipe **pipes =
-   kcalloc(new_capacity, sizeof(*pipes), GFP_KERNEL);
+   kcalloc(new_capacity, sizeof(*pipes), GFP_ATOMIC);
if (!pipes)
return -ENOMEM;
memcpy(pipes, dev->pipes, sizeof(*pipes) * dev->pipes_capacity);



Re: [PATCH] lpfc: nvmet_fc: fix format string

2017-05-20 Thread Joe Perches
On Sat, 2017-05-20 at 21:10 +0200, Arnd Bergmann wrote:
> On Sat, May 20, 2017 at 12:28 PM, Joe Perches  wrote:
> > On Fri, 2017-05-19 at 10:04 +0200, Arnd Bergmann wrote:
> > > The lpfc_nvmeio_data() tracing helper always takes a format string and
> > > three additional arguments.
> > 
> > No it doesn't.  It takes a format and arguments.
> > 
> > I don't disagree with the patch, just the characterization
> > > of the lpfc_nvmeio_data call in the commit message.
> 
> I think my description is correct, it's just not obvious from
> reading the code until you also look at the lpfc_debugfs_nvme_trc
> prototype:

OK, but that's more a mismatch between a function and its
arguments and a different function called within it.



linux-next 20170519 - semaphores broken

2017-05-20 Thread valdis . kletnieks
Seeing problems with programs that use semaphores.  The one
that I'm getting bit by is jackd.  strace says:

getuid()= 967
semget(0x282929, 0, 000)= 229376
semop(229376, [{0, -1, SEM_UNDO}], 1)   = -1 EIDRM (Identifier removed)
write(2, "JACK semaphore error: semop (Ide"..., 49JACK semaphore error: semop (Identifier removed)
) = 49

Bisects down to this commit, and reverting it from 20170519 makes things work
again.  No idea why this causes indigestion, there's probably something subtly
wrong here:

commit 337f43326737b5eb28eb13f43c27a5788da0f913
Author: Manfred Spraul 
Date:   Fri May 19 07:39:23 2017 +1000

ipc: merge ipc_rcu and kern_ipc_perm

ipc has two management structures that exist for every id:
- struct kern_ipc_perm, it contains e.g. the permissions.
- struct ipc_rcu, it contains the rcu head for rcu handling and
  the refcount.
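
For anyone wanting to exercise the same semop() path without jackd, a
minimal sketch follows. It is not jackd's code: it creates a fresh private
semaphore set instead of looking up key 0x282929, and the names
sem_undo_roundtrip() and union semun_arg are made up for this example.
On a working kernel the function is expected to return 0.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux the caller has to define the semctl() argument union itself. */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/*
 * Create a private set with one semaphore, set it to 1, decrement it with
 * SEM_UNDO (the same kind of operation as in the strace output), then
 * remove the set. Returns 0 on success and -1 if any step fails.
 */
int sem_undo_roundtrip(void)
{
    struct sembuf op = { .sem_num = 0, .sem_op = -1, .sem_flg = SEM_UNDO };
    union semun_arg arg = { .val = 1 };
    int rc = -1;
    int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (id < 0)
        return -1;
    if (semctl(id, 0, SETVAL, arg) == 0 && semop(id, &op, 1) == 0)
        rc = 0;
    semctl(id, 0, IPC_RMID); /* always clean up the set */
    return rc;
}
```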









[PATCH] x86: fix reference to lockup watchdog documentation

2017-05-20 Thread Benjamin Peterson
Fixes: 9919cba7ff71147803c988521cc1ceb80e7f0f6d ("watchdog: Update documentation")
Signed-off-by: Benjamin Peterson 
---
 arch/x86/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cd18994a9555..4ccfacc7232a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -360,7 +360,7 @@ config SMP
  Management" code will be disabled if you say Y here.
 
  See also ,
-  and the SMP-HOWTO available at
+  and the SMP-HOWTO available 
at
  .
 
  If you don't know what to do here, say N.
-- 
2.11.0



Re: [PATCH] PCI: Make SR-IOV capable GPU working on the SR-IOV incapable platform

2017-05-20 Thread Alexander Duyck
I'd say the common solution is probably the parameter that allows the
user to disable SR-IOV in the kernel on boot.

The problem with trying to do this automatically is that there are too
many scenarios to know what it was that the BIOS was trying to do.

Another alternative would be to look at providing a means of changing
how the SR-IOV code tries to fix broken setups. Right now it defaults
to trying to allocate the data, as it assumes it is going to enable
SR-IOV on every device that has SR-IOV support. An alternative might
be to make the kernel parameter support multiple options. You could have
nosriov as one option, another option that only enables SR-IOV on
devices that are fully configured and disables it otherwise, and then
our current default, which is to try enabling SR-IOV on any device that
could support it. Then you could probably also make the default a kernel
configuration option, so you could build a kernel that defaults to the
middle option, which leaves correctly configured SR-IOV devices enabled
and disables SR-IOV otherwise.
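
The three-way option described above could look roughly like the sketch
below. Nothing here exists in the kernel: enum sriov_policy, its values
and sriov_should_enable() are hypothetical names that only illustrate the
decision, assuming the caller already knows whether the device is SR-IOV
capable and whether firmware configured it fully.

```c
#include <stdbool.h>

/* Hypothetical policy values mirroring the three options in the proposal. */
enum sriov_policy {
    SRIOV_POLICY_OFF,             /* the "nosriov" case: never enable */
    SRIOV_POLICY_CONFIGURED_ONLY, /* enable only on fully configured devices */
    SRIOV_POLICY_ALL,             /* current default: try every capable device */
};

/* Hypothetical helper: should SR-IOV be set up for this device? */
bool sriov_should_enable(enum sriov_policy policy, bool capable,
                         bool fully_configured)
{
    if (!capable)
        return false;

    switch (policy) {
    case SRIOV_POLICY_OFF:
        return false;
    case SRIOV_POLICY_CONFIGURED_ONLY:
        return fully_configured;
    case SRIOV_POLICY_ALL:
        return true;
    }
    return false;
}
```

A build-time default would then simply select which of the three values
is used when no boot parameter is given.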

- Alex

On Sat, May 20, 2017 at 7:29 AM, Zytaruk, Kelly  wrote:
> Collins,
>
> Okay, good to know.
> Is there a common solution that can handle all cases?
>
> Thanks,
> Kelly
>
>>-Original Message-
>>From: Cheng, Collins
>>Sent: Saturday, May 20, 2017 6:38 AM
>>To: Zytaruk, Kelly; Alexander Duyck; Alex Williamson
>>Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org;
>>Deucher, Alexander; Yinghai Lu
>>Subject: RE: [PATCH] PCI: Make SR-IOV capable GPU working on the SR-IOV
>>incapable platform
>>
>>Hi Kelly,
>>
>>This issue also happens in "not SR-IOV capable" SBIOS. It seems some "not
>>SR-IOV capable" SBIOS will directly report error in system BIOS boot stage
>>and doesn't boot to OS. But other "not SR-IOV capable" SBIOS would not report
>>error and boot to Linux.
>>
>>-Collins Cheng
>>
>>
>>-Original Message-
>>From: Zytaruk, Kelly
>>Sent: Saturday, May 20, 2017 6:28 PM
>>To: Cheng, Collins ; Alexander Duyck
>>; Alex Williamson 
>>Cc: Bjorn Helgaas ; linux-...@vger.kernel.org; linux-
>>ker...@vger.kernel.org; Deucher, Alexander ;
>>Yinghai Lu 
>>Subject: RE: [PATCH] PCI: Make SR-IOV capable GPU working on the SR-IOV
>>incapable platform
>>
>>
>>
>>>-Original Message-
>>>From: Cheng, Collins
>>>Sent: Saturday, May 20, 2017 12:53 AM
>>>To: Alexander Duyck; Alex Williamson
>>>Cc: Bjorn Helgaas; linux-...@vger.kernel.org;
>>>linux-kernel@vger.kernel.org; Deucher, Alexander; Zytaruk, Kelly;
>>>Yinghai Lu
>>>Subject: RE: [PATCH] PCI: Make SR-IOV capable GPU working on the SR-IOV
>>>incapable platform
>>>
>>>Hi Alex,
>>>
>>>Yes, I hope kernel can disable SR-IOV and related VF resource
>>>allocation if the system BIOS is not SR-IOV capable.
>>>
>>>Adding the parameter "pci=nosriov" sounds a doable solution, but it
>>>would need user to add this parameter manually, right? I think an
>>>automatic detection would be better. My patch is trying to auto detect
>>>and bypass VF resource allocation.
>>>
>>>
>>>-Collins Cheng
>>>
>>
>>Collins, be careful about this.  I don't think that this is what we want.  If
>>you add "pci=nosriov" then you are globally disabling SRIOV for all devices.
>>This is not the solution that we are looking for.
>>Remember that there are 3 types of SBIOS; "not SR-IOV capable", "SR-IOV
>>capable but does not support large resources", "Complete SR-IOV support".
>>
>>The problem is that we are trying to find a fix for "broken" SBIOS that does
>>support SR-IOV but does not support the full SR-IOV capabilities that devices
>>with large resources require.
>>
>>Thanks,
>>Kelly
>>
>>>
>>>-Original Message-
>>>From: Alexander Duyck [mailto:alexander.du...@gmail.com]
>>>Sent: Friday, May 19, 2017 11:44 PM
>>>To: Alex Williamson 
>>>Cc: Cheng, Collins ; Bjorn Helgaas
>>>; linux-...@vger.kernel.org; linux-
>>>ker...@vger.kernel.org; Deucher, Alexander ;
>>>Zytaruk, Kelly ; Yinghai Lu 
>>>Subject: Re: [PATCH] PCI: Make SR-IOV capable GPU working on the SR-IOV
>>>incapable platform
>>>
>>>On Mon, May 15, 2017 at 10:53 AM, Alex Williamson
>>> wrote:
 On Mon, 15 May 2017 08:19:28 +
 "Cheng, Collins"  wrote:

> Hi Williamson,
>
> We cannot assume BIOS supports SR-IOV, actually only newer server
>>>motherboard BIOS supports SR-IOV. Normal desktop motherboard BIOS or
>>>older server motherboard BIOS doesn't support SR-IOV. This issue would
>>>happen if an user plugs our AMD SR-IOV capable GPU card to a normal desktop
>>motherboard.

 Servers should be supporting SR-IOV for a 

Re: [PATCH] PCI: Make SR-IOV capable GPU working on the SR-IOV incapable platform

2017-05-20 Thread Alexander Duyck
I'd say the common solution is probably the parameter that allows the
user to disable SR-IOV in the kernel on boot.

The problem with trying to do this automatically is that there are too
many scenarios to know what it was that the BIOS was trying to do.

Another alternative would be to look at providing a means of changing
how the SR-IOV code tries to fix broken setups. Right now it defaults
to trying to allocate the data as it assumes it is going to enable
SR-IOV on every device that has SR-IOV support. An alternative might
be to make the kernel option support multiple options. You could have
it do nosriov as one option, and another option that only enables
SR-IOV on devices that are fully configured and disables it otherwise,
and then our current default option which is to try enabling SR-IOV on
any device that could support it. Then you could probably also make
the default something you could have as a kernel configuration option
so you could build a kernel that defaults to the middle option that
leaves SR-IOV devices correctly configured enabled, and disables it
otherwise.
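
A rough sketch of what such a three-way option could look like is below.
Everything in it is hypothetical: the enum, the function names and the
"off"/"configured"/"always" strings are made up for illustration and do
not exist in the PCI core; only the three behaviours come from the
description above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical policy values; none of these names exist in the kernel. */
enum sriov_policy {
	SRIOV_POLICY_OFF,        /* never enable SR-IOV (the "nosriov" behaviour) */
	SRIOV_POLICY_CONFIGURED, /* enable only on devices that are fully configured */
	SRIOV_POLICY_ALWAYS,     /* current default: try on every capable device */
};

/*
 * Parse a hypothetical option string. Unknown or missing values fall back
 * to the compiled-in default, which is how a kernel configuration option
 * could select the "middle" policy.
 */
enum sriov_policy sriov_policy_parse(const char *arg, enum sriov_policy fallback)
{
	if (!arg)
		return fallback;
	if (strcmp(arg, "off") == 0)
		return SRIOV_POLICY_OFF;
	if (strcmp(arg, "configured") == 0)
		return SRIOV_POLICY_CONFIGURED;
	if (strcmp(arg, "always") == 0)
		return SRIOV_POLICY_ALWAYS;
	return fallback;
}

/* Decide, per device, whether SR-IOV resources should be set up at all. */
bool sriov_should_enable(enum sriov_policy policy, bool dev_capable,
			 bool fully_configured)
{
	if (!dev_capable)
		return false;

	switch (policy) {
	case SRIOV_POLICY_OFF:
		return false;
	case SRIOV_POLICY_CONFIGURED:
		return fully_configured;
	case SRIOV_POLICY_ALWAYS:
		return true;
	}
	return false;
}
```

With the middle policy, a capable device that the BIOS left unconfigured
is simply skipped instead of triggering a resource reallocation attempt.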

- Alex

On Sat, May 20, 2017 at 7:29 AM, Zytaruk, Kelly  wrote:
> Collins,
>
> Okay, good to know.
> Is there a common solution that can handle all cases?
>
> Thanks,
> Kelly
>
>>-Original Message-
>>From: Cheng, Collins
>>Sent: Saturday, May 20, 2017 6:38 AM
>>To: Zytaruk, Kelly; Alexander Duyck; Alex Williamson
>>Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org;
>>Deucher, Alexander; Yinghai Lu
>>Subject: RE: [PATCH] PCI: Make SR-IOV capable GPU working on the SR-IOV
>>incapable platform
>>
>>Hi Kelly,
>>
>>This issue also happens in "not SR-IOV capable" SBIOS. It seems some "not 
>>SR-IOV
>>capable" SBIOS will directly report error in system BIOS boot stage and 
>>doesn't
>>boot to OS. But other "not SR-IOV capable" SBIOS would not report error and
>>boot to Linux.
>>
>>-Collins Cheng
>>
>>
>>-Original Message-
>>From: Zytaruk, Kelly
>>Sent: Saturday, May 20, 2017 6:28 PM
>>To: Cheng, Collins ; Alexander Duyck
>>; Alex Williamson 
>>Cc: Bjorn Helgaas ; linux-...@vger.kernel.org; linux-
>>ker...@vger.kernel.org; Deucher, Alexander ;
>>Yinghai Lu 
>>Subject: RE: [PATCH] PCI: Make SR-IOV capable GPU working on the SR-IOV
>>incapable platform
>>
>>
>>
>>>-Original Message-
>>>From: Cheng, Collins
>>>Sent: Saturday, May 20, 2017 12:53 AM
>>>To: Alexander Duyck; Alex Williamson
>>>Cc: Bjorn Helgaas; linux-...@vger.kernel.org;
>>>linux-kernel@vger.kernel.org; Deucher, Alexander; Zytaruk, Kelly;
>>>Yinghai Lu
>>>Subject: RE: [PATCH] PCI: Make SR-IOV capable GPU working on the SR-IOV
>>>incapable platform
>>>
>>>Hi Alex,
>>>
>>>Yes, I hope kernel can disable SR-IOV and related VF resource
>>>allocation if the system BIOS is not SR-IOV capable.
>>>
>>>Adding the parameter "pci=nosriov" sounds a doable solution, but it
>>>would need user to add this parameter manually, right? I think an
>>>automatic detection would be better. My patch is trying to auto detect and
>>bypass VF resource allocation.
>>>
>>>
>>>-Collins Cheng
>>>
>>
>>Collins, be careful about this.  I don't think that this is what we want.  If 
>>you add
>>"pci=nosriov" then you are globally disabling SRIOV for all devices.  This is 
>>not the
>>solution that we are looking for.
>>Remember that there are 3 types of SBIOS; "not SR-IOV capable", "SR-IOV
>>capable but does not support large resources", "Complete SR-IOV support".
>>
>>The problem is that we are trying to find a fix for "broken" SBIOS that does
>>support SR-IOV but does not support the full SR-IOV capabilities that devices 
>>with
>>large resources require.
>>
>>Thanks,
>>Kelly
>>
>>>
>>>-Original Message-
>>>From: Alexander Duyck [mailto:alexander.du...@gmail.com]
>>>Sent: Friday, May 19, 2017 11:44 PM
>>>To: Alex Williamson 
>>>Cc: Cheng, Collins ; Bjorn Helgaas
>>>; linux-...@vger.kernel.org; linux-
>>>ker...@vger.kernel.org; Deucher, Alexander ;
>>>Zytaruk, Kelly ; Yinghai Lu 
>>>Subject: Re: [PATCH] PCI: Make SR-IOV capable GPU working on the SR-IOV
>>>incapable platform
>>>
>>>On Mon, May 15, 2017 at 10:53 AM, Alex Williamson
>>> wrote:
 On Mon, 15 May 2017 08:19:28 +
 "Cheng, Collins"  wrote:

> Hi Williamson,
>
> We cannot assume BIOS supports SR-IOV, actually only newer server
>>>motherboard BIOS supports SR-IOV. Normal desktop motherboard BIOS or
>>>older server motherboard BIOS doesn't support SR-IOV. This issue would
>>>happen if an user plugs our AMD SR-IOV capable GPU card to a normal desktop
>>motherboard.

 Servers should be supporting SR-IOV for a long time now.  What really
 is there to a BIOS supporting SR-IOV anyway, it's simply reserving
 sufficient bus number and MMIO resources such that we can enable the
 VFs.  This process isn't exclusively reserved for the BIOS.  Some
 platforms may choose to only initialize boot devices, leaving the
 rest for the OS to program.  The initial 

Re: [PATCH] drm: remove NULL pointer check for clk_disable_unprepare

2017-05-20 Thread Rob Clark
On Sat, May 20, 2017 at 3:04 PM, Masahiro Yamada
 wrote:
> 2017-05-21 2:58 GMT+09:00 Masahiro Yamada :
>> After long term efforts of fixing non-common clock implementations,
>> clk_disable() is a no-op for a NULL pointer input, and this is now
>> tree-wide consistent.
>>
>> All clock consumers can safely call clk_disable(_unprepare) without
>> NULL pointer check.
>>
>> Signed-off-by: Masahiro Yamada 
>
>
> Sorry, I retract this patch.
>
> Krzysztof pointed out
> cleanups only for clk_disable_unprepare() will lose the code symmetry.
>
> NULL pointer checks for clk_prepare_enable() should be
> removed to keep the code symmetrical.
>
> This is possible for common-clock framework because
> clk_prepare_enable() is also a no-op for a NULL clk input.
> But it is not necessarily true for non-common clock implementations.

At least for drm/msm, upstream I think we can assume CCF.. I still
need to check for possible downstream kernels to which someone might
want to backport drm/msm.

It might be an idea to split this up per-driver, since at least for
some drivers it might be safe to assume CCF (or non-CCF clk driver
that behaves the same)
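
To make the symmetry point concrete, here is a small user-space mock. It
is only a sketch: struct mock_clk and the mock_* functions are stand-ins
invented for this example, not the kernel's struct clk or the real
clk_prepare_enable()/clk_disable_unprepare(). It assumes the behaviour
described above, where both directions accept a NULL clock as a no-op.

```c
#include <stddef.h>

/* Stand-in for struct clk; NOT the kernel's definition. */
struct mock_clk {
	int prepare_count;
	int enable_count;
};

/* Assumed CCF-like behaviour: a NULL clock is accepted and reports success. */
int mock_clk_prepare_enable(struct mock_clk *clk)
{
	if (!clk)
		return 0;
	clk->prepare_count++;
	clk->enable_count++;
	return 0;
}

/* Assumed behaviour: disabling a NULL clock is a no-op. */
void mock_clk_disable_unprepare(struct mock_clk *clk)
{
	if (!clk)
		return;
	clk->enable_count--;
	clk->prepare_count--;
}

/* Hypothetical driver state with one mandatory and one optional clock. */
struct mock_dev {
	struct mock_clk *core_clk;
	struct mock_clk *optional_clk;	/* may be NULL */
};

/* No NULL checks on the enable side... */
int mock_dev_power_on(struct mock_dev *dev)
{
	int ret = mock_clk_prepare_enable(dev->core_clk);

	if (ret)
		return ret;

	ret = mock_clk_prepare_enable(dev->optional_clk);
	if (ret)
		mock_clk_disable_unprepare(dev->core_clk);
	return ret;
}

/* ...and none on the disable side, so both paths read the same way. */
void mock_dev_power_off(struct mock_dev *dev)
{
	mock_clk_disable_unprepare(dev->optional_clk);
	mock_clk_disable_unprepare(dev->core_clk);
}
```

On a clock implementation where the enable side does not tolerate NULL,
only the power-off half of this would be safe, which is the asymmetry
being discussed.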

BR,
-R

>
> --
> Best Regards
> Masahiro Yamada


Re: [PATCH] Make x86 use $TARGET-readelf like all the other arches.

2017-05-20 Thread Kees Cook
On Sat, May 20, 2017 at 1:03 PM, Rob Landley  wrote:
> From: Rob Landley 
>
> My cross-compile environment doesn't provide an unprefixed
> readelf in the $PATH, which works fine on every target but x86,
> where you get a bunch of "/bin/sh: 1: readelf: not found"
> messages (but the result still works anyway).
>
> Signed-off-by: Rob Landley 

Ooops, thanks for the catch!

Acked-by: Kees Cook 

-Kees

> ---
>
>  arch/x86/boot/compressed/Makefile |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
> index 44163e8..2c860ad 100644
> --- a/arch/x86/boot/compressed/Makefile
> +++ b/arch/x86/boot/compressed/Makefile
> @@ -94,7 +94,7 @@ vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
>  quiet_cmd_check_data_rel = DATAREL $@
>  define cmd_check_data_rel
> for obj in $(filter %.o,$^); do \
> -   readelf -S $$obj | grep -qF .rel.local && { \
> +   ${CROSS_COMPILE}readelf -S $$obj | grep -qF .rel.local && { \
> echo "error: $$obj has data relocations!" >&2; \
> exit 1; \
> } || true; \



-- 
Kees Cook
Pixel Security


Re: [PATCH 7/7] DWARF: add the config option

2017-05-20 Thread Linus Torvalds
On Sat, May 20, 2017 at 4:00 PM, Linus Torvalds
 wrote:
>
> hjl already posted an example of the kinds of horrors glibc does to do
> things "right".

Side note: we'd hopefully/presumably never need anything _that_
disgusting for the kernel, so hjl's example is probably an extreme
one.

But even when we just did the pushq/popq_cfi macros etc to try to have
simple and reasonably legible annotations for the common cases, it got
pretty ugly.

It wasn't that extreme glibc kind of "50 lines of ugly for two
instructions of code", but it was pretty bad. And as far as I know we
never even tried to annotate places where we did "pushf/pop %reg" in
inline asm (for saving/restoring flags)

   Linus


Re: [PATCH 7/7] DWARF: add the config option

2017-05-20 Thread Linus Torvalds
On Sat, May 20, 2017 at 2:56 PM, Andy Lutomirski  wrote:
> On Sat, May 20, 2017 at 1:16 PM, Linus Torvalds
>  wrote:
>>
>> The amount of unreadable crap and bugs it requires is not worth the
>> pain. Not for *any* amount of gain, and the gain here is basically
>> zero.
>
> But what if objtool autogenerated the annotations, perhaps with a tiny
> bit of help telling it "hardware frame goes here" or "pt_regs goes
> here"?

You snipped the next part of my email, where I said:

> The *only* acceptable model is automated tools (ie objtool). Don't
> even bother to try to go any other way. Because I will not accept that
> shit.

so yes, objtool parsing things on its own is acceptable (and it had
better not need any help - it already checks frame pointer data).

The CFI annotations needed in asm are horrendous. We used to have
them, and we didn't have even _remotely_ complete annotations and
despite that they were
 (a) wrong
 (b) incomplete
 (c) made the asm impossible to read and even worse to modify.

hjl already posted an example of the kinds of horrors glibc does to do
things "right". And those rabbit ears around "right" are there for a
reason. There's no way that is ever right - even if it gets the right
results, it's an unmaintainable piece of crap.

So no, we're never ever adding that CFI garbage back into the kernel.

A tool that can generate it is ok, but even then we should expect
inevitable bugs and not trust the end result blindly.

Because dwarf info is complex enough that other tools have gotten it
wrong many many times. Just google for "gcc bugzilla cfi" or go to the
gcc bugzilla and search for "DWARF" or whatever. It's not "oh, we once
had a bug". It's constant.

One of the reasons I like the idea of having objtool generate
something *simpler* than dwarf is that it not only is much easier to
parse, it has a much higher likelihood of not having some crazy bug.
If objtool mainly looks at the actual instructions, and perhaps uses
dwarf information as additional input and creates something much
simpler than dwarf, it might have a chance in hell of occasionally
even getting it right.
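
To give an idea of how small "something much simpler than dwarf" could
be, here is a sketch. The format is invented for this example (the
struct layout, field names and register encoding are all assumptions,
not anything objtool emits): one fixed-size entry per instruction
address where the rule changes, saying only which register and offset
give the caller's frame (the CFA).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size unwind hint: no opcodes, no expression stack. */
struct simple_unwind_entry {
	uint64_t ip;      /* first instruction address this rule applies to */
	uint8_t  cfa_reg; /* 0 = stack pointer, 1 = frame pointer */
	int16_t  cfa_off; /* CFA = register value + cfa_off */
};

/*
 * The table is sorted by ip. Return the last entry whose ip is <= the
 * address we are unwinding from, or NULL if the address precedes the table.
 */
const struct simple_unwind_entry *
simple_unwind_find(const struct simple_unwind_entry *tab, size_t n, uint64_t ip)
{
	const struct simple_unwind_entry *found = NULL;
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tab[mid].ip <= ip) {
			found = &tab[mid];
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}
	return found;
}

/* Apply the rule: pick the base register and add the signed offset. */
uint64_t simple_unwind_cfa(const struct simple_unwind_entry *e,
			   uint64_t sp, uint64_t fp)
{
	uint64_t base = (e->cfa_reg == 0) ? sp : fp;

	return base + (uint64_t)(int64_t)e->cfa_off;
}
```

A standard x86-64 prologue would need three entries under this scheme:
CFA = sp + 8 at function entry, sp + 16 after the push of %rbp, and
fp + 16 once %rbp has been set up. Finding the next frame is then a
binary search and one addition, with nothing to interpret.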

Because dwarf information is really really complicated. It's
complicated because it contains *way* more information than just how
to find the next stack frame.

I mean, it has basically a RPN interpreter built in, and that's the
_simple_ part.

  Linus


Re: [v3, 2/2] initramfs: Allow again choice of the embedded initram compression algorithm

2017-05-20 Thread Florian Fainelli
Hi Francisco, Nicholas,

Nicholas already fixed part of this commit, but there is more breakage,
see below:

On 09/27/2016 01:32 PM, klondike wrote:
> Choosing the appropriate compression option when using an embedded initramfs
> can result in significant size differences in the resulting data.
> 
> This is caused by avoiding double compression of the initramfs contents.
> For example on my tests, choosing CONFIG_INITRAMFS_COMPRESSION_NONE when
> compressing the kernel using XZ results in up to 500KiB differences (9MiB to
>  8.5MiB) in the kernel size as the dictionary will not get polluted with
> incompressible data and may reuse kernel data too.
> 
> Despite embedding an uncompressed initramfs, a user may want to allow for a
> compressed extra initramfs to be passed using the rd system, for example to
> boot a recovery system. Commit 9ba4bcb645898d562498ea66a0df958ef0e7a68c
> ("initramfs: read CONFIG_RD_ variables for initramfs compression") broke
> that behavior by making the choice based on CONFIG_RD_* instead of adding
> CONFIG_INITRAMFS_COMPRESSION_LZ4. Sadly, CONFIG_RD_* is also used to
> choose the supported RD compression algorithms by the kernel and a user may
> want to support more than one.
> 
> This patch also reverses 3e4e0f0a8756dade3023d1f47d50fbced7749788
> ("initramfs: remove "compression mode" choice") restoring back the
> "compression mode" choice and includes the CONFIG_INITRAMFS_COMPRESSION_LZ4
> option which was never added.
> 
> As a result the following options are added or re-added affecting the embedded
> initramfs compression:
> INITRAMFS_COMPRESSION_NONE Do no compression
> INITRAMFS_COMPRESSION_GZIP Compress using gzip
> INITRAMFS_COMPRESSION_BZIP2 Compress using bzip2
> INITRAMFS_COMPRESSION_LZMA Compress using lzma
> INITRAMFS_COMPRESSION_XZ Compress using xz
> INITRAMFS_COMPRESSION_LZO Compress using lzo
> INITRAMFS_COMPRESSION_LZ4 Compress using lz4
> 
> These depend on the corresponding CONFIG_RD_* option being set (except NONE
> which has no dependencies).
> 
> This patch depends on the previous one (the previous version didn't) to
> simplify the way in which the algorithm is chosen and keep backwards
> compatibility with the behaviour introduced by commit
>  9ba4bcb645898d562498ea66a0df958ef0e7a68c
> 
> Signed-off-by: Francisco Blas Izquierdo Riera (klondike) 
> 
> Cc: P J P 
> Cc: Paul Bolle 
> Cc: Andrew Morton 

Running a bisection against usr/ was not particularly convincing, but
here is basically what I am observing. This used to work just fine
as of v4.9, and I have since tracked the breakage to this particular
commit/patch. Here is what my build system does:

- kernel is initially configured not to have an initramfs included
- build the user space root file system
- re-configure the kernel to have an initramfs included
(CONFIG_INITRAMFS_SOURCE="/path/to/romfs") and set relevant
CONFIG_INITRAMFS options, in my case, no compression
- kernel is re-built with these options -> kernel+initramfs image is copied
- kernel is re-built again without these options -> kernel image is copied

Now suppose you make changes to your root filesystem, like adding/removing
applications: initramfs_data.cpio is now a stale file, and you go through
the same process again:

- build the kernel without an initramfs
- user space (re)build
- build the kernel with an initramfs

Building a kernel without an initramfs means setting this option:

CONFIG_INITRAMFS_SOURCE=""

whereas building a kernel with an initramfs means setting these options:

CONFIG_INITRAMFS_SOURCE="/home/fainelli/work/uclinux-rootfs/romfs /home/fainelli/work/uclinux-rootfs/misc/initramfs.dev"
CONFIG_INITRAMFS_ROOT_UID=1000
CONFIG_INITRAMFS_ROOT_GID=1000
CONFIG_INITRAMFS_COMPRESSION_NONE=y
# CONFIG_INITRAMFS_COMPRESSION_GZIP is not set
# CONFIG_INITRAMFS_COMPRESSION_BZIP2 is not set
# CONFIG_INITRAMFS_COMPRESSION_LZMA is not set
# CONFIG_INITRAMFS_COMPRESSION_XZ is not set
# CONFIG_INITRAMFS_COMPRESSION_LZO is not set
# CONFIG_INITRAMFS_COMPRESSION_LZ4 is not set
CONFIG_INITRAMFS_COMPRESSION=""

Commit db2aa7fd15e857891cefbada8348c8d938c7a2bc ("initramfs: allow again
choice of the embedded initram compression algorithm") appears
problematic because CONFIG_INITRAMFS_COMPRESSION which is used to
determine the initramfs cpio extension/compression is a string, and due
to how Kconfig works, it will evaluate, in order, how to assign it.
Setting CONFIG_INITRAMFS_COMPRESSION_NONE with
CONFIG_INITRAMFS_SOURCE="" cannot possibly work (because of the depends
on INITRAMFS_SOURCE!="") yet we still manage to get it assigned to
something: ".gz" because CONFIG_RD_GZIP=y is set in my kernel, even when
there is no initramfs being built.
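
As a simplified model of the evaluation described above (not the actual
usr/Kconfig contents: the reduced set of entries, their exact order and
all names below are assumptions made for illustration), the string ends
up being the first default whose condition holds:

```c
#include <stdbool.h>
#include <stddef.h>

/* One "default <value> if <condition>" line of a Kconfig string symbol. */
struct kconfig_default {
	const char *value;
	bool condition;
};

/* Kconfig-style resolution: the first default whose condition is true wins. */
const char *kconfig_first_default(const struct kconfig_default *defs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (defs[i].condition)
			return defs[i].value;
	return NULL;
}

/* Hypothetical reduced model of how the cpio suffix gets selected. */
const char *initramfs_suffix(bool have_source, bool want_none, bool want_gzip,
			     bool rd_gzip)
{
	/*
	 * The compression choice depends on INITRAMFS_SOURCE != "", so its
	 * entries are forced off when no initramfs source is configured.
	 */
	bool none = have_source && want_none;
	bool gzip = have_source && want_gzip;

	const struct kconfig_default defs[] = {
		{ "",    none },    /* INITRAMFS_COMPRESSION_NONE */
		{ ".gz", gzip },    /* INITRAMFS_COMPRESSION_GZIP */
		{ ".gz", rd_gzip }, /* fallback on RD_GZIP (assumed position) */
		{ "",    true },    /* final catch-all */
	};

	return kconfig_first_default(defs, sizeof(defs) / sizeof(defs[0]));
}
```

In this model, initramfs_suffix(true, true, false, true) yields "" while
initramfs_suffix(false, true, false, true) yields ".gz": the same
"no compression" request resolves to two different suffixes depending
only on whether INITRAMFS_SOURCE is set.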

So we basically end up generating two initramfs_data.cpio* files, one
without extension, and one with .gz. This causes usr/Makefile to track
usr/initramfs_data.cpio.gz, and not usr/initramfs_data.cpio, that is
also problematic after 9e3596b0c6539e28546ff7c72a06576627068353
("kbuild: initramfs cleanup, set target from Kconfig") 

Re: [PATCH 7/7] DWARF: add the config option

2017-05-20 Thread H.J. Lu
On Sat, May 20, 2017 at 2:58 PM, Andy Lutomirski  wrote:
> On Sat, May 20, 2017 at 1:01 PM, H.J. Lu  wrote:
>> On Sat, May 20, 2017 at 9:20 AM, Josh Poimboeuf  wrote:
>>

>>>> (H.J., could we get a binutils feature that allows us to do:
>>>>
>>>> pushq %whatever
>>>> .cfi_adjust_sp -8
>>>> ...
>>>> popq %whatever
>>>> .cfi_adjust_sp 8

>>
>> Np.  Compiler needs to generate this.
>>
>
> How would the compiler generate this when inline asm is involved?  For
> the kernel, objtool could get around the need to have these
> annotations, but not so much for user code?  Is the compiler supposed
> to parse the inline asm?  Would the compiler provide some magic % code
> to represent the current CFA base register?

Here is one example of inline asm with call frame info:

https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/x86_64/sigaction.c;h=be058bac436d1cc9794b2b03107676ed99f6b872;hb=HEAD

-- 
H.J.


[PATCH] ubifs: Wire-up statx() support

2017-05-20 Thread Richard Weinberger
statx() can report what flags a file has; expose the flags that UBIFS
supports. STATX_ATTR_COMPRESSED and STATX_ATTR_ENCRYPTED in particular
can be interesting for userspace.

Signed-off-by: Richard Weinberger 
---
 fs/ubifs/dir.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
index 566079d9b402..578b6043f308 100644
--- a/fs/ubifs/dir.c
+++ b/fs/ubifs/dir.c
@@ -1647,6 +1647,21 @@ int ubifs_getattr(const struct path *path, struct kstat *stat,
struct ubifs_inode *ui = ubifs_inode(inode);
 
mutex_lock(&ui->ui_mutex);
+
+   if (ui->flags & UBIFS_APPEND_FL)
+   stat->attributes |= STATX_ATTR_APPEND;
+   if (ui->flags & UBIFS_COMPR_FL)
+   stat->attributes |= STATX_ATTR_COMPRESSED;
+   if (ui->flags & UBIFS_CRYPT_FL)
+   stat->attributes |= STATX_ATTR_ENCRYPTED;
+   if (ui->flags & UBIFS_IMMUTABLE_FL)
+   stat->attributes |= STATX_ATTR_IMMUTABLE;
+
+   stat->attributes_mask |= (STATX_ATTR_APPEND |
+   STATX_ATTR_COMPRESSED |
+   STATX_ATTR_ENCRYPTED |
+   STATX_ATTR_IMMUTABLE);
+
generic_fillattr(inode, stat);
stat->blksize = UBIFS_BLOCK_SIZE;
stat->size = ui->ui_size;
-- 
2.12.0
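
For the userspace side, a minimal sketch of how the reported bits might
be decoded is shown here. The DEMO_* macros and demo_format_attrs() are
names made up for this example; the bit values are copied from the
statx UAPI definitions so that the snippet builds without kernel
headers. Only bits that the filesystem advertises in the mask are
treated as meaningful.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Local copies of the STATX_ATTR_* bit values used by the patch. */
#define DEMO_STATX_ATTR_COMPRESSED 0x00000004
#define DEMO_STATX_ATTR_IMMUTABLE  0x00000010
#define DEMO_STATX_ATTR_APPEND     0x00000020
#define DEMO_STATX_ATTR_ENCRYPTED  0x00000800

struct demo_attr_name {
	uint64_t bit;
	const char *name;
};

/*
 * Write a comma-separated list of the attributes that are both set and
 * covered by the mask into buf. Returns the number of names written.
 */
int demo_format_attrs(uint64_t attributes, uint64_t attributes_mask,
		      char *buf, size_t len)
{
	static const struct demo_attr_name names[] = {
		{ DEMO_STATX_ATTR_APPEND,     "append" },
		{ DEMO_STATX_ATTR_COMPRESSED, "compressed" },
		{ DEMO_STATX_ATTR_ENCRYPTED,  "encrypted" },
		{ DEMO_STATX_ATTR_IMMUTABLE,  "immutable" },
	};
	uint64_t set = attributes & attributes_mask;
	size_t used = 0;
	int count = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		int n;

		if (!(set & names[i].bit))
			continue;

		n = snprintf(buf + used, len - used, "%s%s",
			     count ? "," : "", names[i].name);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the partial name */
			break;
		}
		used += (size_t)n;
		count++;
	}
	return count;
}
```

In a real program the two inputs would come from the stx_attributes and
stx_attributes_mask fields of struct statx after a successful statx()
call on a file stored on UBIFS.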



Re: [PATCH 7/7] DWARF: add the config option

2017-05-20 Thread Andy Lutomirski
On Sat, May 20, 2017 at 1:01 PM, H.J. Lu  wrote:
> On Sat, May 20, 2017 at 9:20 AM, Josh Poimboeuf  wrote:
>
>>>
>>> (H.J., could we get a binutils feature that allows us to do:
>>>
>>> pushq %whatever
>>> .cfi_adjust_sp -8
>>> ...
>>> popq %whatever
>>> .cfi_adjust_sp 8
>>>
>
> Np.  Compiler needs to generate this.
>

How would the compiler generate this when inline asm is involved?  For
the kernel, objtool could get around the need to have these
annotations, but not so much for user code?  Is the compiler supposed
to parse the inline asm?  Would the compiler provide some magic % code
to represent the current CFA base register?


Re: [PATCH 7/7] DWARF: add the config option

2017-05-20 Thread Andy Lutomirski
On Sat, May 20, 2017 at 1:16 PM, Linus Torvalds
 wrote:
> On Fri, May 19, 2017 at 10:23 PM, Andy Lutomirski  wrote:
>>
>> I personally like the idea of using real DWARF annotations in the
>> entry code because it makes gdb work better (not kgdb -- real gdb
>> attached to KVM).  I bet that we could get entry asm annotations into
>> good shape if we really wanted to.  OTOH, getting DWARF to work well
>> for inline asm is really nasty IIRC.
>
> No. I will NAK *any* attempt to make our asm contain the crazy
> shit-for-brains annotations.
>
> Been there, done that, got the T-shirt, and then doused the T-shirt in
> gasoline and put it on fire.
>
> The amount of unreadable crap and bugs it requires is not worth the
> pain. Not for *any* amount of gain, and the gain here is basically
> zero.

But what if objtool autogenerated the annotations, perhaps with a tiny
bit of help telling it "hardware frame goes here" or "pt_regs goes
here"?


Re: CHIPPro NAND issue with 4.12 rc1

2017-05-20 Thread Angus Ainslie

On 2017-05-20 09:14, Boris Brezillon wrote:
> On Sat, 20 May 2017 08:49:04 -0600,
> Angus Ainslie wrote:
>
>> Hi All,
>>
>> I'm trying to boot a CHIPPro with the stock 4.12 rc1 kernel. If I make
>> no modifications to the sun5i-gr8-chip-pro.dtb the kernel boots but
>> can't find the root partition.
>>
>> So I added the partitions to the dts file
>>
>> diff --git a/arch/arm/boot/dts/sun5i-gr8-chip-pro.dts b/arch/arm/boot/dts/sun5i-gr8-chip-pro.dts
>> index c55b11a..0e61e6b 100644
>> --- a/arch/arm/boot/dts/sun5i-gr8-chip-pro.dts
>> +++ b/arch/arm/boot/dts/sun5i-gr8-chip-pro.dts
>> @@ -146,6 +146,32 @@
>>  reg = <0>;
>>  allwinner,rb = <0>;
>>  nand-ecc-mode = "hw";
>> +nand-on-flash-bbt;
>> +
>> +spl@0 {
>> +  label = "SPL";
>> +  reg = /bits/ 64 <0x0 0x40>;
>> +};
>> +
>> +spl-backup@40 {
>> +  label = "SPL.backup";
>> +  reg = /bits/ 64 <0x40 0x40>;
>> +};
>> +
>> +u-boot@80 {
>> +  label = "U-Boot";
>> +  reg = /bits/ 64 <0x80 0x40>;
>> +};
>> +
>> +env@c0 {
>> +  label = "env";
>> +  reg = /bits/ 64 <0xc0 0x40>;
>> +};
>> +
>> +rootfs@100 {
>> +  label = "rootfs";
>> +  reg = /bits/ 64 <0x100 0x1f00>;
>> +};
>>  };
>>   };
>>
>> and now the kernel finds the partition but it times out trying to mount
>> it. It seems to be something in the dts files because if I use the
>> ntc-gr8-crumb.dts from the ntc 4.4.30 kernel then the system boots all
>> the way to userland.
>
> Hm, that's weird. Just changing the dtb makes it work? Did you try to
> dump both dtbs and figure out what else changes?


Yeah, I thought it was weird too. I was thinking that maybe the pin muxes
were getting changed and the rb net or the interrupt net was getting
changed to a different function.

I did decompile the 2 dtbs and I couldn't find many differences. They
were mostly around some pull-ups and drive strength for some of the NAND
and i2c pins. I tried adding those changes and it still didn't work, so I
went back to the minimal set of changes to reproduce the bug.



> Also, I wonder how the NAND is correctly detected without this patch
> [1].




That patch seems to be in my 4.12-rc1 kernel; I have a definition for
the TC58NVG2S0H.




>> [7.13] ubi0: scanning is finished
>> [7.15] ubi0: attached mtd4 (name "rootfs", size 496 MiB)
>> [7.16] ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 258048 bytes
>> [7.17] ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size 1024
>> [7.18] ubi0: VID header offset: 1024 (aligned 1024), data offset: 4096
>> [7.19] ubi0: good PEBs: 1977, bad PEBs: 7, corrupted PEBs: 0
>> [7.20] ubi0: user volume: 1, internal volumes: 1, max. volumes count: 128
>> [7.21] ubi0: max/mean erase counter: 3/1, WL threshold: 4096, image sequence number: 177407
>> [7.22] ubi0: available PEBs: 1, total reserved PEBs: 1976, PEBs reserved for bad PEB handling: 33
>
> UBI attach works...
>
>> [7.24] hctosys: unable to open rtc device (rtc0)
>> [7.25] vcc3v0: disabling
>> [7.25] ALSA device list:
>> [7.26]   #0: sun4i-codec
>> [7.26] ubi0: background thread "ubi_bgt0d" started, PID 53
>> [8.32] sunxi_nand 1c03000.nand: wait interrupt timedout
>> [9.32] sunxi_nand 1c03000.nand: wait interrupt timedout
>> [   10.33] sunxi_nand 1c03000.nand: wait for empty cmd FIFO timedout
>> [   11.34] sunxi_nand 1c03000.nand: wait for empty cmd FIFO timedout
>> [   12.35] sunxi_nand 1c03000.nand: wait for empty cmd FIFO timedout
>> [   13.36] sunxi_nand 1c03000.nand: wait for empty cmd FIFO timedout
>> [   14.37] sunxi_nand 1c03000.nand: wait for empty cmd FIFO timedout
>> [   14.38] ubi0 warning: ubi_io_read: error -110 while reading 4096 bytes from PEB 1034:4096, read only 0 bytes, retry
>
> And suddenly you get timeouts. That's really weird.



Is there anything I can do on this end to help debug?




> [1] https://github.com/NextThingCo/linux/commit/5ebc35ce1223ef14ace9479d5f97d0fce979e550
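
As a sanity check on the numbers in the log, the UBI geometry looks internally consistent, which fits the observation that the attach itself works and only the later reads time out. The sketch below just redoes the arithmetic with values copied from the ubi0 lines; the function names are made up for illustration. Reading the "error -110" as -ETIMEDOUT assumes the usual Linux errno numbering (as used on ARM and x86).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Values copied from the ubi0 lines in the log. */
enum {
	PEB_SIZE    = 262144,	/* "PEB size: 262144 bytes (256 KiB)" */
	DATA_OFFSET = 4096,	/* "data offset: 4096" */
	GOOD_PEBS   = 1977,	/* "good PEBs: 1977" */
	BAD_PEBS    = 7		/* "bad PEBs: 7" */
};

/* A LEB is what remains of a PEB after the UBI headers (the data offset). */
uint32_t ubi_leb_size(uint32_t peb_size, uint32_t data_offset)
{
	assert(data_offset < peb_size);
	return peb_size - data_offset;
}

/* Size of the attached MTD partition in MiB: every PEB counts, good or bad. */
uint64_t ubi_mtd_size_mib(uint32_t good, uint32_t bad, uint32_t peb_size)
{
	return ((uint64_t)(good + bad) * peb_size) >> 20;
}

/* Nonzero if a negative kernel-style error code means "timed out". */
int ubi_err_is_timeout(int err)
{
	return -err == ETIMEDOUT;
}

void print_ubi_summary(void)
{
	printf("LEB size: %u bytes\n",
	       (unsigned)ubi_leb_size(PEB_SIZE, DATA_OFFSET));
	printf("MTD size: %llu MiB\n",
	       (unsigned long long)ubi_mtd_size_mib(GOOD_PEBS, BAD_PEBS, PEB_SIZE));
	printf("-110 is a timeout here: %s\n",
	       ubi_err_is_timeout(-110) ? "yes" : "no");
}
```

The results match the log: 262144 - 4096 gives the reported LEB size of 258048 bytes, and (1977 + 7) PEBs of 256 KiB give the reported 496 MiB for mtd4. The 1976 reserved PEBs plus the 1 available PEB also add up to the 1977 good PEBs.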



