Re: [PATCH RFC 1/2] sched: Minimize the idle cpu selection race window.

2017-10-31 Thread Mike Galbraith
On Wed, 2017-11-01 at 01:08 -0500, Atish Patra wrote:
> 
> On 10/31/2017 03:48 AM, Mike Galbraith wrote:
> 
> > I played with something ~similar (cmpxchg() idle cpu reservation)
> I had an atomic version earlier as well. Peter's suggestion for per cpu 
> seems to perform slightly better than atomic.

Yeah, no doubt about that.

> Thus, this patch has the per cpu version.
> > a while back in the context of schbench, and it did help that,
> Do you have the schbench configuration somewhere that I can test? I 
> tried various configurations but did not
> see any improvement or regression.

No, as noted, I didn't save anything.  I watched and fiddled with
several configurations on various sized boxen, doing this/that on top
of the reservation thing to try to improve the "it's 100% about perfect
spread" schbench, but didn't like the pricetag, so tossed the lot over
my shoulder and walked away.

-Mike


linux-next: manual merge of the tip tree with the sound tree

2017-10-31 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tip tree got conflicts in:

  sound/oss/midibuf.c
  sound/oss/soundcard.c
  sound/oss/sys_timer.c
  sound/oss/uart6850.c

between commit:

  727dede0ba8a ("sound: Retire OSS")

from the sound tree and commit:

  1d27e3e2252b ("timer: Remove expires and data arguments from DEFINE_TIMER")

from the tip tree.

I fixed it up (I just deleted the files) and can carry the fix as
necessary. This is now fixed as far as linux-next is concerned, but any
non trivial conflicts should be mentioned to your upstream maintainer
when your tree is submitted for merging.  You may also want to consider
cooperating with the maintainer of the conflicting tree to minimise any
particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell


Re: [PATCH v3] usb: wusbcore: Use put_unaligned_le32

2017-10-31 Thread Greg KH
On Tue, Oct 31, 2017 at 11:41:45PM +0530, Himanshu Jha wrote:
> On Tue, Oct 17, 2017 at 05:14:30PM +0530, Himanshu Jha wrote:
> 
> Hi Greg,
> 
> > Use put_unaligned_le32 rather than using byte ordering function and
> > memcpy which makes code clear.
> > Also, add the header file where it is declared.
> >
> 
> I hope my patch is in your queue!

You will get an email when it happens...
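
For readers unfamiliar with the helper named in the quoted description: the kernel's put_unaligned_le32(val, p) stores a 32-bit value in little-endian byte order at an address that need not be aligned. The patch itself is not shown in this thread, so the following is only a user-space sketch of the idea. The name sketch_put_unaligned_le32() is a hypothetical stand-in, not the kernel implementation.

```c
#include <stdint.h>

/*
 * Hypothetical user-space stand-in for the kernel helper: store val at p
 * in little-endian byte order, one byte at a time, so that p does not
 * have to be 4-byte aligned and the host byte order does not matter.
 */
static void sketch_put_unaligned_le32(uint32_t val, void *p)
{
	uint8_t *b = p;

	b[0] = (uint8_t)(val & 0xff);
	b[1] = (uint8_t)((val >> 8) & 0xff);
	b[2] = (uint8_t)((val >> 16) & 0xff);
	b[3] = (uint8_t)((val >> 24) & 0xff);
}
```

A single call like this replaces the two-step pattern the description mentions (convert the byte order into a temporary, then memcpy it into the buffer).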


Re: [PATCH] net: recvmsg: Unconditionally zero struct sockaddr_storage

2017-10-31 Thread Willy Tarreau
On Tue, Oct 31, 2017 at 09:14:45AM -0700, Kees Cook wrote:
> diff --git a/net/socket.c b/net/socket.c
> index c729625eb5d3..34183f4fbdf8 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -2188,6 +2188,7 @@ static int ___sys_recvmsg(struct socket *sock, struct user_msghdr __user *msg,
>   struct sockaddr __user *uaddr;
>   int __user *uaddr_len = COMPAT_NAMELEN(msg);
>  
> + memset(&addr, 0, sizeof(addr));
>   msg_sys->msg_name = &addr;

Isn't this going to cause a performance hit in the fast path ? Just
checking, I have not read the whole code with the patch in its context.

Willy


Re: [PATCH v2] hv: kvp: Avoid reading past allocated blocks from KVP file

2017-10-31 Thread Greg KH
On Tue, Oct 31, 2017 at 01:02:35PM -0700, Long Li wrote:
> From: Paul Meyer 
> 
> While reading in more than one block (50) of KVP records, the allocation
> goes per block, but the reads used the total number of allocated records
> (without resetting the pointer/stream). This causes the records buffer to
> overrun when the refresh reads more than one block over the previous
> capacity (e.g. reading more than 100 KVP records whereas the in-memory
> database was empty before).
> 
> Fix this by reading the correct number of KVP records from file each time.
> 
> Signed-off-by: Paul Meyer 
> Signed-off-by: Long Li 
> ---
>  tools/hv/hv_kvp_daemon.c | 66 
> 
>  1 file changed, 10 insertions(+), 56 deletions(-)

When you version a patch, you always have to say what changed below the
--- line, as the documentation states to do...

v3? :)

thanks,

greg k-h
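
The overrun described in the quoted changelog can be illustrated with a small user-space sketch. It assumes a flat file of fixed-size records and grows the buffer one block of 50 records at a time. The record layout and all sketch_* names are hypothetical, and this is not the hv_kvp_daemon code. The point is only that each read asks for one block and writes at the current tail of the buffer.

```c
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_BLOCK 50		/* records per allocation block */

/* Hypothetical fixed-size record; the real KVP record layout differs. */
struct sketch_record {
	char key[16];
	char value[16];
};

/*
 * Read all records from f, growing the buffer one block at a time.
 * Each fread() requests at most one block and stores it at the current
 * tail, so it cannot run past the memory allocated so far.
 * Returns the record count, or -1 on allocation failure. On success
 * *out receives the buffer, which the caller must free.
 */
static long sketch_load_records(FILE *f, struct sketch_record **out)
{
	struct sketch_record *recs = NULL;
	size_t used = 0;

	for (;;) {
		struct sketch_record *tmp;
		size_t got;

		tmp = realloc(recs, (used + SKETCH_BLOCK) * sizeof(*recs));
		if (!tmp) {
			free(recs);
			return -1;
		}
		recs = tmp;

		got = fread(recs + used, sizeof(*recs), SKETCH_BLOCK, f);
		used += got;
		if (got < SKETCH_BLOCK)
			break;	/* short read: end of file or error */
	}

	*out = recs;
	return (long)used;
}
```

Asking fread() for the total number of allocated records on every pass, instead of one block, is the kind of mistake the changelog describes.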


Re: Nokia N9: fun with camera

2017-10-31 Thread Pavel Machek
Hi!

> Sakari, I am actually playing with N9 camera, not N950. That comes
> next.
> 
> And the clock error I mentioned ... seems to be
> -EPROBE_DEFER. So... not an issue.

Hmm, and with similar config, I got N950 to work. ... which should
give me enough clues to get N9 to work. I guess I forgot to reset the
pipeline between the tries, or something.

For the record, this got me some data on n950:

 m.media_ctl( [ '-f', '"OMAP3 ISP CSI2a":0 [fmt:%s/%dx%d]' % (m.fmt, m.cap_x, m.cap_y) ] )
 m.media_ctl( [ '-l', '"OMAP3 ISP CSI2a":1 -> "OMAP3 ISP CSI2a output":0[1]' ] )

 # WORKS
 # pavel@n900:~/g/tui/camera$ sudo /my/tui/yavta/yavta
 # --capture=8 --skip 0 --format SGRBG10 --size 4272x3016 /dev/video1 --file=/tmp/delme#

...ouch. It only worked twice :-(. Either driver gets confused by my
attempts, or it relied on some other initialization code. Strange.

Best regards,
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




Re: [PATCH 2/2] mm:swap: unify cluster-based and vma-based swap readahead

2017-10-31 Thread Minchan Kim
On Wed, Nov 01, 2017 at 02:17:17PM +0800, Huang, Ying wrote:
> Minchan Kim  writes:
> 
> > With this patch, do_swap_page no longer needs to be aware of the two
> > different swap readahead algorithms. It unifies the cluster-based and
> > vma-based readahead function calls.
> >
> > Signed-off-by: Minchan Kim 
> > ---
> >  include/linux/swap.h | 17 -
> >  mm/memory.c  | 11 ---
> >  mm/shmem.c   |  5 -
> >  mm/swap_state.c  | 21 +++--
> >  4 files changed, 35 insertions(+), 19 deletions(-)
> >
> > diff --git a/include/linux/swap.h b/include/linux/swap.h
> > index 7c7c8b344bc9..9cc330360eac 100644
> > --- a/include/linux/swap.h
> > +++ b/include/linux/swap.h
> > @@ -425,9 +425,11 @@ extern struct page *read_swap_cache_async(swp_entry_t, gfp_t,
> >  extern struct page *__read_swap_cache_async(swp_entry_t, gfp_t,
> > struct vm_area_struct *vma, unsigned long addr,
> > bool *new_page_allocated);
> > -extern struct page *swapin_readahead(swp_entry_t, gfp_t,
> > -   struct vm_area_struct *vma, unsigned long addr);
> > -extern struct page *do_swap_page_readahead(swp_entry_t fentry, gfp_t gfp_mask,
> > +extern struct page *cluster_readahead(swp_entry_t entry, gfp_t flag,
> > +   struct vm_fault *vmf);
> 
> In addition to swap readahead, there are file readahead too.  So better
> add swap in name, such as swap_cluster_readahead()?

Yup.

> 
> > +extern struct page *swapin_readahead(swp_entry_t entry, gfp_t flag,
> > +   struct vm_fault *vmf);
> > +extern struct page *vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
> >struct vm_fault *vmf);
> 
> > I don't see vma_readahead() used outside of swap_state.c, so why
> > declare it here?

With the wrapper function, there is no point in declaring it.
Yup, let's drop it.
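
To make the wrapper idea concrete, here is a minimal user-space sketch of a single entry point that hides the choice between the two readahead algorithms from its caller. The mm/swap_state.c hunk is cut off in this archive, so the body below is an assumption about the shape of the wrapper, not the posted code. All sketch_* types and functions are hypothetical stand-ins for the kernel ones.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types, for illustration only. */
struct sketch_page { int id; };
struct sketch_vm_fault { unsigned long address; };
typedef unsigned long sketch_swp_entry_t;
typedef unsigned int sketch_gfp_t;

/* Stands in for the result of swap_use_vma_readahead(). */
static bool sketch_use_vma_readahead_flag;

static struct sketch_page sketch_vma_page = { 1 };
static struct sketch_page sketch_cluster_page = { 2 };

/* Stub for the vma-based algorithm. */
static struct sketch_page *sketch_vma_readahead(sketch_swp_entry_t entry,
		sketch_gfp_t gfp, struct sketch_vm_fault *vmf)
{
	(void)entry; (void)gfp; (void)vmf;
	return &sketch_vma_page;
}

/* Stub for the cluster-based algorithm. */
static struct sketch_page *sketch_swap_cluster_readahead(sketch_swp_entry_t entry,
		sketch_gfp_t gfp, struct sketch_vm_fault *vmf)
{
	(void)entry; (void)gfp; (void)vmf;
	return &sketch_cluster_page;
}

/* The only function the fault handler would need to call. */
static struct sketch_page *sketch_swapin_readahead(sketch_swp_entry_t entry,
		sketch_gfp_t gfp, struct sketch_vm_fault *vmf)
{
	return sketch_use_vma_readahead_flag ?
		sketch_vma_readahead(entry, gfp, vmf) :
		sketch_swap_cluster_readahead(entry, gfp, vmf);
}
```

With this shape, only the wrapper needs a declaration in the header, which is why declaring the vma-based helper there becomes unnecessary.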


> 
> >  
> >  /* linux/mm/swapfile.c */
> > @@ -536,8 +538,13 @@ static inline void put_swap_page(struct page *page, swp_entry_t swp)
> >  {
> >  }
> >  
> > +static inline struct page *cluster_readahead(swp_entry_t, gfp_t gfp_mask
> > +   struct vm_fault *vmf)
> > +{
> > +}
> > +
> >  static inline struct page *swapin_readahead(swp_entry_t swp, gfp_t gfp_mask,
> > -   struct vm_area_struct *vma, unsigned long addr)
> > +   struct vm_fault *vmf)
> >  {
> > return NULL;
> >  }
> > @@ -547,7 +554,7 @@ static inline bool swap_use_vma_readahead(void)
> > return false;
> >  }
> 
> Now swap_use_vma_readahead() is used in swap_state.c only, so we can
> remove it from the header file?

Will do.

> 
> > -static inline struct page *do_swap_page_readahead(swp_entry_t fentry,
> > +static inline struct page *vma_readahead(swp_entry_t fentry,
> > gfp_t gfp_mask, struct vm_fault *vmf)
> >  {
> > return NULL;
> > diff --git a/mm/memory.c b/mm/memory.c
> > index e955298e4290..ce5e3d7ccc5c 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2889,7 +2889,8 @@ int do_swap_page(struct vm_fault *vmf)
> > if (si->flags & SWP_SYNCHRONOUS_IO &&
> > __swap_count(si, entry) == 1) {
> > /* skip swapcache */
> > -   page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vmf->address);
> > +   page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma,
> > +   vmf->address);
> > if (page) {
> > __SetPageLocked(page);
> > __SetPageSwapBacked(page);
> > @@ -2898,12 +2899,8 @@ int do_swap_page(struct vm_fault *vmf)
> > swap_readpage(page, true);
> > }
> > } else {
> > -   if (swap_use_vma_readahead())
> > -   page = do_swap_page_readahead(entry,
> > -   GFP_HIGHUSER_MOVABLE, vmf);
> > -   else
> > -   page = swapin_readahead(entry,
> > -  GFP_HIGHUSER_MOVABLE, vma, vmf->address);
> > +   page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
> > +   vmf);
> > swapcache = page;
> > }
> >  
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 62dfdc097e44..2522bc0958e1 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -1413,9 +1413,12 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
> >  {
> > struct vm_area_struct pvma;
> > struct page *page;
> > +   struct vm_fault vmf;
> >  
> > shmem_pseudo_vma_init(&pvma, info, index);
> > -   page = swapin_readahead(swap, gfp, &pvma, 0);
> > +   vmf.vma = &pvma;
> > +   vmf.address = 0;
> > +   page = cluster_readahead(swap, gfp, &vmf);
> > shmem_pseudo_vma_destroy(&pvm

Re: [PATCH] at24: support eeproms that do not roll over page reads.

2017-10-31 Thread kbuild test robot
Hi Sven,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.14-rc7]
[cannot apply to next-20171018]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Sven-Van-Asbroeck/at24-support-eeproms-that-do-not-roll-over-page-reads/20171101-114231
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)


vim +210 drivers/misc/eeprom/at24.c

   185  
   186  /*
   187   * This routine supports chips which consume multiple I2C addresses. It
   188   * computes the addressing information to be used for a given r/w request.
   189   * Assumes that sanity checks for offset happened at sysfs-layer.
   190   *
   191   * Slave address and byte offset derive from the offset. Always
   192   * set the byte address; on a multi-master board, another master
   193   * may have changed the chip's "current" address pointer.
   194   *
   195   * In case of chips that don't rollover page reads, truncate the count
   196   * to the nearest page boundary. This might result in the
   197   * at24_eeprom_read_XXX functions reading fewer bytes than requested,
   198   * but this is compensated for in at24_read().
   199   */
   200  static struct i2c_client *at24_translate_offset(struct at24_data *at24,
   201                  unsigned int *offset, size_t *count)
   202  {
   203          unsigned int i, bits, remainder;
   204  
   205          bits = (at24->chip.flags & AT24_FLAG_ADDR16) ? 16 : 8;
   206          i = *offset >> bits;
   207          *offset &= AT24_BITMASK(bits);
   208          if ((at24->chip.flags & AT24_FLAG_NO_RDROL) && count) {
   209                  remainder = BIT(bits) - *offset;
 > 210                  *count = min(*count, remainder);
   211          }
   212  
   213          return at24->client[i];
   214  }
   215  
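
The arithmetic in the listing can be tried out with a small user-space sketch. It mirrors the flagged function but returns the client index instead of a pointer. The flag values and sketch_* names are hypothetical. The flagged line compares a size_t with an unsigned int, but the text of the sparse warning is not included in this report, so whether that mismatch is the cause is an assumption. The sketch simply keeps both operands as size_t.

```c
#include <stddef.h>

#define SKETCH_FLAG_ADDR16	0x1u	/* hypothetical flag values */
#define SKETCH_FLAG_NO_RDROL	0x2u

/*
 * Returns the index of the I2C client to use and rewrites *offset to the
 * offset within that client. For chips flagged as not rolling over,
 * *count is clamped so the access stops at the end of that client's
 * address range.
 */
static unsigned int sketch_translate_offset(unsigned int flags,
		unsigned int *offset, size_t *count)
{
	unsigned int bits = (flags & SKETCH_FLAG_ADDR16) ? 16 : 8;
	unsigned int i = *offset >> bits;
	size_t remainder;

	*offset &= (1u << bits) - 1;
	if ((flags & SKETCH_FLAG_NO_RDROL) && count) {
		remainder = ((size_t)1 << bits) - *offset;
		if (*count > remainder)
			*count = remainder;
	}

	return i;
}
```

For example, with 8-bit addressing an offset of 0x1F0 selects client 1 at byte 0xF0, and a 64-byte request is clamped to the 16 bytes that remain in that client's range.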

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


Re: [PATCH 2/2] mm:swap: unify cluster-based and vma-based swap readahead

2017-10-31 Thread Huang, Ying
Minchan Kim  writes:

> With this patch, do_swap_page no longer needs to be aware of the two
> different swap readahead algorithms. It unifies the cluster-based and
> vma-based readahead function calls.
>
> Signed-off-by: Minchan Kim 
> ---
>  include/linux/swap.h | 17 -
>  mm/memory.c  | 11 ---
>  mm/shmem.c   |  5 -
>  mm/swap_state.c  | 21 +++--
>  4 files changed, 35 insertions(+), 19 deletions(-)
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 7c7c8b344bc9..9cc330360eac 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -425,9 +425,11 @@ extern struct page *read_swap_cache_async(swp_entry_t, gfp_t,
>  extern struct page *__read_swap_cache_async(swp_entry_t, gfp_t,
>   struct vm_area_struct *vma, unsigned long addr,
>   bool *new_page_allocated);
> -extern struct page *swapin_readahead(swp_entry_t, gfp_t,
> - struct vm_area_struct *vma, unsigned long addr);
> -extern struct page *do_swap_page_readahead(swp_entry_t fentry, gfp_t gfp_mask,
> +extern struct page *cluster_readahead(swp_entry_t entry, gfp_t flag,
> + struct vm_fault *vmf);

In addition to swap readahead, there are file readahead too.  So better
add swap in name, such as swap_cluster_readahead()?

> +extern struct page *swapin_readahead(swp_entry_t entry, gfp_t flag,
> + struct vm_fault *vmf);
> +extern struct page *vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
>  struct vm_fault *vmf);

I don't see vma_readahead() used outside of swap_state.c, so why
declare it here?

>  
>  /* linux/mm/swapfile.c */
> @@ -536,8 +538,13 @@ static inline void put_swap_page(struct page *page, swp_entry_t swp)
>  {
>  }
>  
> +static inline struct page *cluster_readahead(swp_entry_t, gfp_t gfp_mask
> + struct vm_fault *vmf)
> +{
> +}
> +
>  static inline struct page *swapin_readahead(swp_entry_t swp, gfp_t gfp_mask,
> - struct vm_area_struct *vma, unsigned long addr)
> + struct vm_fault *vmf)
>  {
>   return NULL;
>  }
> @@ -547,7 +554,7 @@ static inline bool swap_use_vma_readahead(void)
>   return false;
>  }

Now swap_use_vma_readahead() is used in swap_state.c only, so we can
remove it from the header file?

> -static inline struct page *do_swap_page_readahead(swp_entry_t fentry,
> +static inline struct page *vma_readahead(swp_entry_t fentry,
>   gfp_t gfp_mask, struct vm_fault *vmf)
>  {
>   return NULL;
> diff --git a/mm/memory.c b/mm/memory.c
> index e955298e4290..ce5e3d7ccc5c 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2889,7 +2889,8 @@ int do_swap_page(struct vm_fault *vmf)
>   if (si->flags & SWP_SYNCHRONOUS_IO &&
>   __swap_count(si, entry) == 1) {
>   /* skip swapcache */
> - page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vmf->address);
> + page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma,
> + vmf->address);
>   if (page) {
>   __SetPageLocked(page);
>   __SetPageSwapBacked(page);
> @@ -2898,12 +2899,8 @@ int do_swap_page(struct vm_fault *vmf)
>   swap_readpage(page, true);
>   }
>   } else {
> - if (swap_use_vma_readahead())
> - page = do_swap_page_readahead(entry,
> - GFP_HIGHUSER_MOVABLE, vmf);
> - else
> - page = swapin_readahead(entry,
> -GFP_HIGHUSER_MOVABLE, vma, vmf->address);
> + page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
> + vmf);
>   swapcache = page;
>   }
>  
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 62dfdc097e44..2522bc0958e1 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1413,9 +1413,12 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
>  {
>   struct vm_area_struct pvma;
>   struct page *page;
> + struct vm_fault vmf;
>  
>   shmem_pseudo_vma_init(&pvma, info, index);
> - page = swapin_readahead(swap, gfp, &pvma, 0);
> + vmf.vma = &pvma;
> + vmf.address = 0;
> + page = cluster_readahead(swap, gfp, &vmf);
>   shmem_pseudo_vma_destroy(&pvma);
>  
>   return page;
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index e3c535fcd2df..5ee53d4ee047 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -538,11 +538,10 @@ static unsigned long swapin_nr_pages(unsigned long offset)
>  }
>  
>  /**
> - * swapin_readahead - swap in p

Re: [PATCH RFC 1/2] sched: Minimize the idle cpu selection race window.

2017-10-31 Thread Atish Patra



On 10/31/2017 03:48 AM, Mike Galbraith wrote:
> On Tue, 2017-10-31 at 09:20 +0100, Peter Zijlstra wrote:
> > On Tue, Oct 31, 2017 at 12:27:41AM -0500, Atish Patra wrote:
> > > Currently, multiple tasks can wake up on the same cpu from the
> > > select_idle_sibling() path if they wake up simultaneously
> > > and last ran on the same llc. This happens because an idle cpu
> > > is not updated until the idle task is scheduled out. Any task waking
> > > during that period may potentially select that cpu as a wakeup
> > > candidate.
> > >
> > > Introduce a per cpu variable that is set as soon as a cpu is
> > > selected for wakeup for any task. This prevents other tasks
> > > from selecting the same cpu again. Note: This does not close the race
> > > window but minimizes it to accessing the per-cpu variable. If two
> > > wakee tasks access the per cpu variable at the same time, they may
> > > select the same cpu again. But it minimizes the race window
> > > considerably.
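
The trade-off discussed in this thread (a cheap flag that narrows the race versus a cmpxchg() reservation that closes it) can be sketched in user space with C11 atomics. Neither the patch's per-cpu variable nor the cmpxchg() experiment is shown in this archive, so everything below is an assumption made for illustration. The sketch_* names, the fixed CPU count and the use of <stdatomic.h> instead of kernel primitives are all hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

/* 0 = free, 1 = already picked as a wakeup target. */
static atomic_int sketch_claimed[SKETCH_NR_CPUS];

/*
 * Check-then-set, similar in spirit to a plain per-cpu flag: cheap, but
 * two wakers can both pass the check before either one sets the flag,
 * so the race window is narrowed rather than closed.
 */
static bool sketch_claim_flag(int cpu)
{
	if (atomic_load_explicit(&sketch_claimed[cpu], memory_order_relaxed))
		return false;
	atomic_store_explicit(&sketch_claimed[cpu], 1, memory_order_relaxed);
	return true;
}

/*
 * Compare-and-exchange reservation: exactly one waker can move the slot
 * from 0 to 1, at the cost of an atomic read-modify-write per attempt.
 */
static bool sketch_claim_cmpxchg(int cpu)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&sketch_claimed[cpu],
					      &expected, 1);
}

/* Called once the cpu is no longer a pending wakeup target. */
static void sketch_release(int cpu)
{
	atomic_store(&sketch_claimed[cpu], 0);
}
```

The second variant is the stricter one, and its extra atomic operation is the kind of overhead that the benchmark discussion in this thread weighs against the benefit.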

> > The very most important question; does it actually help? What
> > benchmarks, give what numbers?

Here are the numbers from one of the OLTP configurations on an 8 socket
x86 machine:

kernel     txn/minute (normalized)   user/sys
baseline   1.0                       80/5
pcpu       1.021                     84/5

The throughput gains are not very high and are close to the run-to-run
variation. The schedstat data (added for testing in patch 2/2) indicates
that there are many instances of the race condition that got addressed,
but maybe not enough to trigger a significant throughput change.

All other benchmarks I tested (TPCC, hackbench, schbench, swingbench) did
not show any regression.

I will let Joel post numbers from Android benchmarks.

> I played with something ~similar (cmpxchg() idle cpu reservation)

I had an atomic version earlier as well. Peter's suggestion for per cpu
seems to perform slightly better than atomic.
Thus, this patch has the per cpu version.

> a while back in the context of schbench, and it did help that,

Do you have the schbench configuration somewhere that I can test? I
tried various configurations but did not see any improvement or regression.

> but for
> generic fast mover benchmarks, the added overhead had the expected
> effect, it shaved throughput a wee bit (rob Peter, pay Paul, repeat).

Which benchmark? Is it hackbench or something else?
I have not found any regression yet in my testing. I would be happy to
test any other benchmark or a different configuration for hackbench.

Regards,
Atish

> I still have the patch lying about in my rubbish heap, but didn't
> bother to save any of the test results.
>
> -Mike






linux-next: manual merge of the tip tree with the block tree

2017-10-31 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tip tree got a conflict in:

  drivers/block/amiflop.c

between commit:

  f37ecbfc238b ("amifloppy: Convert timers to use timer_setup()")

from the block tree and commit:

  3c557df67257 ("timer: Remove meaningless .data/.function assignments")

from the tip tree.

I fixed it up (I just used the former version of the motor_on_callback
definition) and can carry the fix as necessary. This is now fixed as
far as linux-next is concerned, but any non trivial conflicts should be
mentioned to your upstream maintainer when your tree is submitted for
merging.  You may also want to consider cooperating with the maintainer
of the conflicting tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell


Re: [PATCH 1/2] mm:swap: clean up swap readahead

2017-10-31 Thread Huang, Ying
Minchan Kim  writes:

> Hi Huang,
>
> On Wed, Nov 01, 2017 at 01:41:00PM +0800, Huang, Ying wrote:
>> Hi, Minchan,
>> 
>> Minchan Kim  writes:
>> 
>> > When I see recent change of swap readahead, I am very unhappy
>> > about current code structure which diverges two swap readahead
>> > algorithm in do_swap_page. This patch is to clean it up.
>> >
>> > Main motivation is that fault handler doesn't need to be aware of
>> > readahead algorithms but just should call swapin_readahead.
>> >
>> > As first step, this patch cleans up a little bit but not perfect
>> > (I just separate for review easier) so next patch will make the goal
>> > complete.
>> >
>> > Signed-off-by: Minchan Kim 
>> > ---
>> >  include/linux/swap.h | 17 ++
>> >  mm/memory.c  | 17 +++---
>> >  mm/swap_state.c  | 89 
>> > 
>> >  3 files changed, 55 insertions(+), 68 deletions(-)
>> >
>> > diff --git a/include/linux/swap.h b/include/linux/swap.h
>> > index 84255b3da7c1..7c7c8b344bc9 100644
>> > --- a/include/linux/swap.h
>> > +++ b/include/linux/swap.h
>> > @@ -427,12 +427,8 @@ extern struct page *__read_swap_cache_async(swp_entry_t, gfp_t,
>> >bool *new_page_allocated);
>> >  extern struct page *swapin_readahead(swp_entry_t, gfp_t,
>> >struct vm_area_struct *vma, unsigned long addr);
>> > -
>> > -extern struct page *swap_readahead_detect(struct vm_fault *vmf,
>> > -struct vma_swap_readahead *swap_ra);
>> >  extern struct page *do_swap_page_readahead(swp_entry_t fentry, gfp_t gfp_mask,
>> > - struct vm_fault *vmf,
>> > - struct vma_swap_readahead *swap_ra);
>> > + struct vm_fault *vmf);
>> >  
>> >  /* linux/mm/swapfile.c */
>> >  extern atomic_long_t nr_swap_pages;
>> > @@ -551,15 +547,8 @@ static inline bool swap_use_vma_readahead(void)
>> >return false;
>> >  }
>> >  
>> > -static inline struct page *swap_readahead_detect(
>> > -  struct vm_fault *vmf, struct vma_swap_readahead *swap_ra)
>> > -{
>> > -  return NULL;
>> > -}
>> > -
>> > -static inline struct page *do_swap_page_readahead(
>> > -  swp_entry_t fentry, gfp_t gfp_mask,
>> > -  struct vm_fault *vmf, struct vma_swap_readahead *swap_ra)
>> > +static inline struct page *do_swap_page_readahead(swp_entry_t fentry,
>> > +  gfp_t gfp_mask, struct vm_fault *vmf)
>> >  {
>> >return NULL;
>> >  }
>> > diff --git a/mm/memory.c b/mm/memory.c
>> > index 8a0c410037d2..e955298e4290 100644
>> > --- a/mm/memory.c
>> > +++ b/mm/memory.c
>> > @@ -2849,21 +2849,14 @@ int do_swap_page(struct vm_fault *vmf)
>> >struct vm_area_struct *vma = vmf->vma;
>> >struct page *page = NULL, *swapcache = NULL;
>> >struct mem_cgroup *memcg;
>> > -  struct vma_swap_readahead swap_ra;
>> >swp_entry_t entry;
>> >pte_t pte;
>> >int locked;
>> >int exclusive = 0;
>> >int ret = 0;
>> > -  bool vma_readahead = swap_use_vma_readahead();
>> >  
>> > -  if (vma_readahead)
>> > -  page = swap_readahead_detect(vmf, &swap_ra);
>> > -  if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte)) {
>> > -  if (page)
>> > -  put_page(page);
>> > +  if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
>> >goto out;
>> > -  }
>> 
>> The page table holding PTE may be unmapped in pte_unmap_same(), so is it
>> safe for us to access page table after this in do_swap_page_readahead()?
>
> That's why I call pte_offset_map in swap_ra_info before the access.

Oh, I found it!  Thanks for explanation!

Best Regards,
Huang, Ying

>> 
>> Best Regards,
>> Huang, Ying
>> 
>> >entry = pte_to_swp_entry(vmf->orig_pte);
>> >if (unlikely(non_swap_entry(entry))) {
>> > @@ -2889,9 +2882,7 @@ int do_swap_page(struct vm_fault *vmf)
>> >  
>> >  
>> >delayacct_set_flag(DELAYACCT_PF_SWAPIN);
>> > -  if (!page)
>> > -  page = lookup_swap_cache(entry, vma_readahead ? vma : NULL,
>> > -   vmf->address);
>> > +  page = lookup_swap_cache(entry, vma, vmf->address);
>> >if (!page) {
>> >struct swap_info_struct *si = swp_swap_info(entry);
>> >  
>> > @@ -2907,9 +2898,9 @@ int do_swap_page(struct vm_fault *vmf)
>> >swap_readpage(page, true);
>> >}
>> >} else {
>> > -  if (vma_readahead)
>> > +  if (swap_use_vma_readahead())
>> >page = do_swap_page_readahead(entry,
>> > -  GFP_HIGHUSER_MOVABLE, vmf, &swap_ra);
>> > +  GFP_HIGHUSER_MOVABLE, vmf);
>> >else
>> >page = swapin_readahead(entry,
>> >   GFP_HIGHUSER_MOVABLE, vma, vmf->address);
>> > diff --

linux-next: manual merge of the tip tree with the s390 tree

2017-10-31 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tip tree got a conflict in:

  arch/s390/lib/spinlock.c

between commit:

  eb3b7b848fb3 ("s390/rwlock: introduce rwlock wait queueing")
(at least)

from the s390 tree and commit:

  6aa7de059173 ("locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()")

from the tip tree.

I fixed it up (the ACCESS_ONCE instances replaced in the latter were
removed by the former ... there was one more ACCESS_ONCE added, but I
left it in place) and can carry the fix as necessary. This is now fixed
as far as linux-next is concerned, but any non trivial conflicts should
be mentioned to your upstream maintainer when your tree is submitted for
merging.  You may also want to consider cooperating with the maintainer
of the conflicting tree to minimise any particularly complex conflicts.



-- 
Cheers,
Stephen Rothwell


Re: WARNING in task_participate_group_stop

2017-10-31 Thread Dmitry Vyukov
On Tue, Oct 31, 2017 at 7:34 PM, Oleg Nesterov  wrote:
> On 10/30, Dmitry Vyukov wrote:
>>
>> On Mon, Oct 30, 2017 at 10:12 PM, syzbot wrote:
>> > Hello,
>> >
>> > syzkaller hit the following crash on
>> > d95e159cd1da1ed4dbf76bf203e8ffaf231395e4
>> > git://git.cmpxchg.org/linux-mmots.git/master
>> > compiler: gcc (GCC) 7.1.1 20170620
>> > .config is attached
>> > Raw console output is attached.
>> > C reproducer is attached
>
> Hmm. I do not see reproducer in this email...

Ah, sorry. You can see full thread with attachments here:
https://groups.google.com/forum/#!topic/syzkaller-bugs/EUmYZU4m5gU


>> > syzkaller reproducer is attached. See https://goo.gl/kgGztJ
>> > for information about syzkaller reproducers
>>
>> This also happens on more recent commits, including linux-next
>> 36ef71cae353f88fd6e095e2aaa3e5953af1685d (Oct 19) and upstream
>> 3e0cc09a3a2c40ec1ffb6b4e12da86e98feccb11 (Oct 18).
>>
>> > WARNING: CPU: 0 PID: 1 at kernel/signal.c:340
>> > task_participate_group_stop+0x1ce/0x230 kernel/signal.c:340
>> > Kernel panic - not syncing: panic_on_warn set ...
>> >
>> > CPU: 0 PID: 1 Comm: init Not tainted 4.13.0-mm1+ #5
>
> Looks familiar... I need some time to recall the details, will try to send
> the fix(es) this week.
>
> So this is init process with SIGNAL_UNKILLABLE flag set. And I hope it has
> the pending SIGKILL, otherwise there is something else.
>
> IIRC the problem is that complete_signal(SIGKILL) does nothing if
> SIGNAL_UNKILLABLE is set, in particular it doesn't set SIGNAL_GROUP_EXIT.
> This fools the signal_group_exit() check in do_signal_stop().
>
> Actually there are more problems with SIGNAL_UNKILLABLE && SIGKILL, we need
> some nasty cleanups.
>
> Oleg.
>
>
>> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> > Google 01/01/2011
>> > Call Trace:
>> >  __dump_stack lib/dump_stack.c:16 [inline]
>> >  dump_stack+0x194/0x257 lib/dump_stack.c:52
>> >  panic+0x1e4/0x417 kernel/panic.c:181
>> >  __warn+0x1c4/0x1d9 kernel/panic.c:542
>> >  report_bug+0x211/0x2d0 lib/bug.c:183
>> >  fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
>> >  do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
>> >  do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
>> >  do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
>> >  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
>> >  invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
>> > RIP: 0010:task_participate_group_stop+0x1ce/0x230 kernel/signal.c:340
>> > RSP: 0018:8801d9ee77f0 EFLAGS: 00010097
>> > RAX: 8801d9ed8040 RBX: 8801d9ed8040 RCX: 8801d9edb2c0
>> > RDX:  RSI: 00060013 RDI: 8801d9ed84d0
>> > RBP: 8801d9ee7808 R08: 8801d9ee7180 R09: 8801d9ee7178
>> > R10: 8801d9ee70f0 R11: 11003b3db29b R12: 8801d9ee9740
>> > R13:  R14: dc00 R15: 8801d9ed85c8
>> >  do_signal_stop+0x217/0x900 kernel/signal.c:2042
>> >  get_signal+0x61c/0x17e0 kernel/signal.c:2297
>> >  do_signal+0x94/0x1ee0 arch/x86/kernel/signal.c:808
>> >  exit_to_usermode_loop+0x224/0x300 arch/x86/entry/common.c:158
>> >  prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
>> >  syscall_return_slowpath+0x42f/0x500 arch/x86/entry/common.c:266
>> >  entry_SYSCALL_64_fastpath+0xbc/0xbe
>> > RIP: 0033:0x7f33f723fdd3
>> > RSP: 002b:7fffb5303398 EFLAGS: 0246 ORIG_RAX: 0017
>> > RAX: fdfe RBX: 7fffb5303540 RCX: 7f33f723fdd3
>> > RDX:  RSI: 7fffb53036f0 RDI: 000b
>> > RBP: 7fffb53036f0 R08: 7fffb5303770 R09: 0001
>> > R10:  R11: 0246 R12: 
>> > R13: 7fffb5303ad0 R14:  R15: 
>> >
>> >
>> > ---
>> > This bug is generated by a dumb bot. It may contain errors.
>> > See https://goo.gl/tpsmEJ for details.
>> > Direct all questions to syzkal...@googlegroups.com.
>> >
>> > syzbot will keep track of this bug report.
>> > Once a fix for this bug is committed, please reply to this email with:
>> > #syz fix: exact-commit-title
>> > To mark this as a duplicate of another syzbot report, please reply with:
>> > #syz dup: exact-subject-of-another-report
>> > If it's a one-off invalid bug report, please reply with:
>> > #syz invalid
>> > Note: if the crash happens again, it will cause creation of a new bug
>> > report.
>> >
>> > --
>> > You received this message because you are subscribed to the Google Groups
>> > "syzkaller-bugs" group.
>> > To unsubscribe from this group and stop receiving emails from it, send an
>> > email to syzkaller-bugs+unsubscr...@googlegroups.com.
>> > To view this discussion on the web visit
>> > https://groups.google.com/d/msgid/syzkaller-bugs/94eb2c058c80ea49ed055cc8695e%40google.com.
>> > For more options, visit https://groups.google.com/d/optout.
>


[PATCH 2/2] perf record: Replace 'overwrite' by 'flightrecorder' for better naming

2017-10-31 Thread Wang Nan
The meaning of perf record's "overwrite" option, and of the many uses of
"overwrite" in the source code, is not clear. In perf's code, 'overwrite'
has 2 meanings:
 1. Make ringbuffer readonly (perf_evlist__mmap_ex's argument).
 2. Set evsel's "backward" attribute (in apply_config_terms).

perf record doesn't use meaning 1 at all, but it has an overwrite option
whose real meaning is setting backward.

This patch separates these two concepts and introduces 'flightrecorder' mode,
which is what we really want. It combines the 2 concepts and wraps
them into a record mode. In flight recorder mode, perf only dumps data before
something happens.
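
As a rough illustration of what "flight recorder" means here, the following user-space sketch keeps only the most recent records in a fixed-size ring and hands them back, oldest first, when a snapshot is taken. It is not perf's mmap ring buffer, which the patch configures through write_backward. The int-sized records, the ring size and all sketch_* names are assumptions made for the example.

```c
#include <stddef.h>

#define SKETCH_RING_SIZE 4

struct sketch_ring {
	int slots[SKETCH_RING_SIZE];
	size_t head;	/* total number of records ever written */
};

/* Store a record; once the ring is full this overwrites the oldest one. */
static void sketch_ring_write(struct sketch_ring *r, int rec)
{
	r->slots[r->head % SKETCH_RING_SIZE] = rec;
	r->head++;
}

/*
 * Copy the surviving records into out, oldest first, and return how many
 * there are. out must have room for SKETCH_RING_SIZE entries.
 */
static size_t sketch_ring_snapshot(const struct sketch_ring *r, int *out)
{
	size_t n = r->head < SKETCH_RING_SIZE ? r->head : SKETCH_RING_SIZE;
	size_t start = r->head - n;
	size_t i;

	for (i = 0; i < n; i++)
		out[i] = r->slots[(start + i) % SKETCH_RING_SIZE];

	return n;
}
```

Writing six records into this four-slot ring and then taking a snapshot yields the last four; the first two have been overwritten and never reach the output.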

Signed-off-by: Wang Nan 
---
 tools/perf/Documentation/perf-record.txt |  8 
 tools/perf/builtin-record.c  |  4 ++--
 tools/perf/perf.h|  2 +-
 tools/perf/util/evsel.c  |  6 +++---
 tools/perf/util/evsel.h  |  4 ++--
 tools/perf/util/parse-events.c   | 20 ++--
 tools/perf/util/parse-events.h   |  4 ++--
 tools/perf/util/parse-events.l   |  4 ++--
 8 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 5a626ef..463c2d3 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -467,19 +467,19 @@ the beginning of record, collect them during finalizing an output file.
 The collected non-sample events reflects the status of the system when
 record is finished.
 
---overwrite::
+--flight-recorder::
 Makes all events use an overwritable ring buffer. An overwritable ring
 buffer works like a flight recorder: when it gets full, the kernel will
 overwrite the oldest records, that thus will never make it to the
 perf.data file.
 
-When '--overwrite' and '--switch-output' are used perf records and drops
+When '--flight-recorder' and '--switch-output' are used perf records and drops
 events until it receives a signal, meaning that something unusual was
 detected that warrants taking a snapshot of the most current events,
 those fitting in the ring buffer at that moment.
 
-'overwrite' attribute can also be set or canceled for an event using
-config terms. For example: 'cycles/overwrite/' and 'instructions/no-overwrite/'.
+'flightrecorder' attribute can also be set or canceled separately for an event using
+config terms. For example: 'cycles/flightrecorder/' and 'instructions/no-flightrecorder/'.
 
 Implies --tail-synthesize.
 
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f4d9fc5..315ea09 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1489,7 +1489,7 @@ static struct option __record_options[] = {
"child tasks do not inherit counters"),
OPT_BOOLEAN(0, "tail-synthesize", &record.opts.tail_synthesize,
"synthesize non-sample events at the end of output"),
-   OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
+   OPT_BOOLEAN(0, "flight-recorder", &record.opts.flight_recorder, "use flight recorder mode"),
OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
 "number of mmap data pages and AUX area tracing mmap pages",
@@ -1733,7 +1733,7 @@ int cmd_record(int argc, const char **argv)
}
}
 
-   if (record.opts.overwrite)
+   if (record.opts.flight_recorder)
record.opts.tail_synthesize = true;
 
if (rec->evlist->nr_entries == 0 &&
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index fbb0a9c..a7f7618 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -57,7 +57,7 @@ struct record_opts {
bool all_kernel;
bool all_user;
bool tail_synthesize;
-   bool overwrite;
+   bool flight_recorder;
bool ignore_missing_thread;
unsigned int freq;
unsigned int mmap_pages;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f894893..0e1e8e8 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -772,8 +772,8 @@ static void apply_config_terms(struct perf_evsel *evsel,
 */
attr->inherit = term->val.inherit ? 1 : 0;
break;
-   case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
-   attr->write_backward = term->val.overwrite ? 1 : 0;
+   case PERF_EVSEL__CONFIG_TERM_FLIGHTRECORDER:
+   attr->write_backward = term->val.flightrecorder ? 1 : 0;
break;
default:
break;
@@ -856,7 +856,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
 
attr->sample_id_all = perf_missing_features.sample_i

[PATCH 0/2] perf record: Fix --overwrite and clarify concepts

2017-10-31 Thread Wang Nan
Kan reports that 'perf record --overwrite' does not work as it should.

Patch 1/2 fixes a bug: it maps backward events to a read-only ring buffer
so that the kernel can overwrite that ring buffer.

Patch 2/2 clarifies the concepts of 'overwrite' and 'backward' in the
source code by introducing the concept of 'flightrecorder' and converting
many uses of 'overwrite' to it. This makes clear that what we really want
is a perf record flight-recorder mode, not merely an overwritable ring
buffer mapping.

Wang Nan (2):
  perf mmap: Fix perf backward recording
  perf record: Replace 'overwrite' by 'flightrecorder' for better naming

 tools/perf/Documentation/perf-record.txt |  8 
 tools/perf/builtin-record.c  |  4 ++--
 tools/perf/perf.h|  2 +-
 tools/perf/util/evlist.c |  8 +++-
 tools/perf/util/evsel.c  |  6 +++---
 tools/perf/util/evsel.h  |  4 ++--
 tools/perf/util/parse-events.c   | 20 ++--
 tools/perf/util/parse-events.h   |  4 ++--
 tools/perf/util/parse-events.l   |  4 ++--
 9 files changed, 33 insertions(+), 27 deletions(-)

-- 
2.10.1



[PATCH 1/2] perf mmap: Fix perf backward recording

2017-10-31 Thread Wang Nan
perf record backward recording doesn't work as expected: it never
overwrites when the ring buffer is full.

Test:

(Run a busy-printing Python task in the background, like this:

 while True:
 print 123

and send SIGUSR2 to perf to capture a snapshot.)

 # ./perf record --overwrite -e raw_syscalls:sys_enter -e raw_syscalls:sys_exit --exclude-perf -a --switch-output
 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2017110101520743 ]
 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2017110101521251 ]
 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2017110101521692 ]
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Dump perf.data.2017110101521936 ]
 [ perf record: Captured and wrote 0.826 MB perf.data. ]

 # ./perf script -i ./perf.data.2017110101520743 | head -n3
 perf  2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0)
 perf  2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, , 3df, 100, 0)
   python  2545 [000] 12449.310800:  raw_syscalls:sys_exit: NR 1 = 4
 # ./perf script -i ./perf.data.2017110101521251 | head -n3
 perf  2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0)
 perf  2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, , 3df, 100, 0)
   python  2545 [000] 12449.310800:  raw_syscalls:sys_exit: NR 1 = 4
 # ./perf script -i ./perf.data.2017110101521692 | head -n3
 perf  2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0)
 perf  2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, , 3df, 100, 0)
   python  2545 [000] 12449.310800:  raw_syscalls:sys_exit: NR 1 = 4

The timestamps never change, even though my background task is a busy
loop that can easily overwhelm the ring buffer.

This patch fixes it by forcibly unsetting PROT_WRITE for the backward ring
buffer, so every backward ring buffer becomes an overwritable ring buffer.
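
As a rough illustration of the flight-recorder behaviour described above,
here is a minimal user-space sketch in C. It is only a model under stated
assumptions: struct fr_ring, fr_write(), fr_snapshot() and FR_CAPACITY are
hypothetical names, not perf or kernel interfaces, and the real backward
ring buffer holds variable-sized records rather than fixed integer slots.
The one point it shows is that a writer which never waits for the reader
leaves only the newest records for a snapshot.

```c
#include <assert.h>
#include <stddef.h>

#define FR_CAPACITY 4	/* hypothetical: number of record slots */

/* Hypothetical model of an overwritable ring buffer, not a perf type. */
struct fr_ring {
	unsigned long head;		/* total records ever written */
	int slots[FR_CAPACITY];		/* fixed-size storage */
};

/* The writer never waits for a reader: a full buffer loses its oldest slot. */
static void fr_write(struct fr_ring *rb, int record)
{
	rb->slots[rb->head % FR_CAPACITY] = record;
	rb->head++;
}

/* Copy the surviving records, oldest first; returns how many were copied. */
static size_t fr_snapshot(const struct fr_ring *rb, int *out, size_t out_len)
{
	size_t n = rb->head < FR_CAPACITY ? rb->head : FR_CAPACITY;
	unsigned long start = rb->head - n;
	size_t i;

	assert(out_len >= n);
	for (i = 0; i < n; i++)
		out[i] = rb->slots[(start + i) % FR_CAPACITY];
	return n;
}
```

Writing the records 1 to 10 into this four-slot buffer and then taking a
snapshot yields 7, 8, 9, 10. Two snapshots taken at different times
therefore differ, which is what the changing timestamps in the test result
are meant to demonstrate.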

Test result:

 # ./perf record --overwrite -e raw_syscalls:sys_enter -e raw_syscalls:sys_exit 
--exclude-perf -a --switch-output
 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2017110101285323 ]
 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2017110101290053 ]
 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2017110101290446 ]
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Dump perf.data.2017110101290837 ]
 [ perf record: Captured and wrote 0.826 MB perf.data. ]
 # ./perf script -i ./perf.data.2017110101285323 | head -n3
   python  2545 [000] 11064.268083:  raw_syscalls:sys_exit: NR 1 = 4
   python  2545 [000] 11064.268084: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
   python  2545 [000] 11064.268086:  raw_syscalls:sys_exit: NR 1 = 4
 # ./perf script -i ./perf.data.2017110101290 | head -n3
 failed to open ./perf.data.2017110101290: No such file or directory
 # ./perf script -i ./perf.data.2017110101290053 | head -n3
   python  2545 [000] 11071.564062: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
   python  2545 [000] 11071.564064:  raw_syscalls:sys_exit: NR 1 = 4
   python  2545 [000] 11071.564066: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
 # ./perf script -i ./perf.data.2017110101290 | head -n3
 perf.data.2017110101290053  perf.data.2017110101290446  perf.data.2017110101290837
 # ./perf script -i ./perf.data.2017110101290446 | head -n3
 sshd  1321 [000] 11075.499473:  raw_syscalls:sys_exit: NR 14 = 0
 sshd  1321 [000] 11075.499474: raw_syscalls:sys_enter: NR 14 (2, 7ffe98899490, 0, 8, 0, 3000)
 sshd  1321 [000] 11075.499474:  raw_syscalls:sys_exit: NR 14 = 0
 # ./perf script -i ./perf.data.2017110101290837 | head -n3
   python  2545 [000] 11079.280844:  raw_syscalls:sys_exit: NR 1 = 4
   python  2545 [000] 11079.280847: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
   python  2545 [000] 11079.280850:  raw_syscalls:sys_exit: NR 1 = 4

Signed-off-by: Wang Nan 
---
 tools/perf/util/evlist.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index c6c891e..4c5daba 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -799,22 +799,28 @@ perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 }
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
-  struct mmap_params *mp, int cpu_idx,
+  struct mmap_params *_mp, int cpu_idx,
   int thread, int *_output, int 
*_output_bac

linux-next: manual merge of the tip tree with the powerpc tree

2017-10-31 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tip tree got a conflict in:

  arch/powerpc/mm/numa.c

between commit:

  cee5405da402 ("powerpc/hotplug: Improve responsiveness of hotplug change")

from the powerpc tree and commit:

  df7e828c1b69 ("timer: Remove init_timer_deferrable() in favor of timer_setup()")

from the tip tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/powerpc/mm/numa.c
index eb604b3574fa,73016451f330..
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@@ -1506,9 -1466,7 +1505,7 @@@ static struct timer_list topology_timer
  
  static void reset_topology_timer(void)
  {
-   topology_timer.data = 0;
-   topology_timer.expires = jiffies + topology_timer_secs * HZ;
-   mod_timer(&topology_timer, topology_timer.expires);
 -  mod_timer(&topology_timer, jiffies + 60 * HZ);
++  mod_timer(&topology_timer, jiffies + topology_timer_secs * HZ);
  }
  
  #ifdef CONFIG_SMP
@@@ -1561,13 -1520,14 +1558,14 @@@ int start_topology_update(void
rc = of_reconfig_notifier_register(&dt_update_nb);
  #endif
}
 -  } else if (firmware_has_feature(FW_FEATURE_VPHN) &&
 +  }
 +  if (firmware_has_feature(FW_FEATURE_VPHN) &&
   lppaca_shared_proc(get_lppaca())) {
if (!vphn_enabled) {
 -  prrn_enabled = 0;
vphn_enabled = 1;
setup_cpu_associativity_change_counters();
-   init_timer_deferrable(&topology_timer);
+   timer_setup(&topology_timer, topology_timer_fn,
+   TIMER_DEFERRABLE);
reset_topology_timer();
}
}


linux-next: manual merge of the tip tree with the arm64 tree

2017-10-31 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tip tree got a conflict in:

  arch/arm64/Kconfig

between commit:

  396a5d4a5c32 ("arm64: Unconditionally support {ARCH_}HAVE_NMI{_SAFE_CMPXCHG}")

from the arm64 tree and commit:

  087133ac9076 ("locking/qrwlock, arm64: Move rwlock implementation over to qrwlocks")

from the tip tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/arm64/Kconfig
index 38f8d26208af,6205f521b648..
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@@ -21,8 -21,25 +21,25 @@@ config ARM6
select ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_HAS_STRICT_MODULE_RWX
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 -  select ARCH_HAVE_NMI_SAFE_CMPXCHG if ACPI_APEI_SEA
 +  select ARCH_HAVE_NMI_SAFE_CMPXCHG
+   select ARCH_INLINE_READ_LOCK if !PREEMPT
+   select ARCH_INLINE_READ_LOCK_BH if !PREEMPT
+   select ARCH_INLINE_READ_LOCK_IRQ if !PREEMPT
+   select ARCH_INLINE_READ_LOCK_IRQSAVE if !PREEMPT
+   select ARCH_INLINE_READ_UNLOCK if !PREEMPT
+   select ARCH_INLINE_READ_UNLOCK_BH if !PREEMPT
+   select ARCH_INLINE_READ_UNLOCK_IRQ if !PREEMPT
+   select ARCH_INLINE_READ_UNLOCK_IRQRESTORE if !PREEMPT
+   select ARCH_INLINE_WRITE_LOCK if !PREEMPT
+   select ARCH_INLINE_WRITE_LOCK_BH if !PREEMPT
+   select ARCH_INLINE_WRITE_LOCK_IRQ if !PREEMPT
+   select ARCH_INLINE_WRITE_LOCK_IRQSAVE if !PREEMPT
+   select ARCH_INLINE_WRITE_UNLOCK if !PREEMPT
+   select ARCH_INLINE_WRITE_UNLOCK_BH if !PREEMPT
+   select ARCH_INLINE_WRITE_UNLOCK_IRQ if !PREEMPT
+   select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE if !PREEMPT
select ARCH_USE_CMPXCHG_LOCKREF
+   select ARCH_USE_QUEUED_RWLOCKS
select ARCH_SUPPORTS_MEMORY_FAILURE
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_NUMA_BALANCING


Re: [PATCH 1/2] mm:swap: clean up swap readahead

2017-10-31 Thread Minchan Kim
Hi Huang,

On Wed, Nov 01, 2017 at 01:41:00PM +0800, Huang, Ying wrote:
> Hi, Minchan,
> 
> Minchan Kim  writes:
> 
> > Looking at the recent swap readahead changes, I am very unhappy with
> > the current code structure, which splits the two swap readahead
> > algorithms inside do_swap_page. This patch cleans it up.
> >
> > The main motivation is that the fault handler doesn't need to be aware
> > of the readahead algorithms; it should just call swapin_readahead.
> >
> > As a first step, this patch cleans things up a little, but not
> > completely (I split it out to make review easier), so the next patch
> > completes the goal.
> >
> > Signed-off-by: Minchan Kim 
> > ---
> >  include/linux/swap.h | 17 ++
> >  mm/memory.c  | 17 +++---
> >  mm/swap_state.c  | 89 
> > 
> >  3 files changed, 55 insertions(+), 68 deletions(-)
> >
> > diff --git a/include/linux/swap.h b/include/linux/swap.h
> > index 84255b3da7c1..7c7c8b344bc9 100644
> > --- a/include/linux/swap.h
> > +++ b/include/linux/swap.h
> > @@ -427,12 +427,8 @@ extern struct page 
> > *__read_swap_cache_async(swp_entry_t, gfp_t,
> > bool *new_page_allocated);
> >  extern struct page *swapin_readahead(swp_entry_t, gfp_t,
> > struct vm_area_struct *vma, unsigned long addr);
> > -
> > -extern struct page *swap_readahead_detect(struct vm_fault *vmf,
> > - struct vma_swap_readahead *swap_ra);
> >  extern struct page *do_swap_page_readahead(swp_entry_t fentry, gfp_t 
> > gfp_mask,
> > -  struct vm_fault *vmf,
> > -  struct vma_swap_readahead *swap_ra);
> > +  struct vm_fault *vmf);
> >  
> >  /* linux/mm/swapfile.c */
> >  extern atomic_long_t nr_swap_pages;
> > @@ -551,15 +547,8 @@ static inline bool swap_use_vma_readahead(void)
> > return false;
> >  }
> >  
> > -static inline struct page *swap_readahead_detect(
> > -   struct vm_fault *vmf, struct vma_swap_readahead *swap_ra)
> > -{
> > -   return NULL;
> > -}
> > -
> > -static inline struct page *do_swap_page_readahead(
> > -   swp_entry_t fentry, gfp_t gfp_mask,
> > -   struct vm_fault *vmf, struct vma_swap_readahead *swap_ra)
> > +static inline struct page *do_swap_page_readahead(swp_entry_t fentry,
> > +   gfp_t gfp_mask, struct vm_fault *vmf)
> >  {
> > return NULL;
> >  }
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 8a0c410037d2..e955298e4290 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2849,21 +2849,14 @@ int do_swap_page(struct vm_fault *vmf)
> > struct vm_area_struct *vma = vmf->vma;
> > struct page *page = NULL, *swapcache = NULL;
> > struct mem_cgroup *memcg;
> > -   struct vma_swap_readahead swap_ra;
> > swp_entry_t entry;
> > pte_t pte;
> > int locked;
> > int exclusive = 0;
> > int ret = 0;
> > -   bool vma_readahead = swap_use_vma_readahead();
> >  
> > -   if (vma_readahead)
> > -   page = swap_readahead_detect(vmf, &swap_ra);
> > -   if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte)) {
> > -   if (page)
> > -   put_page(page);
> > +   if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
> > goto out;
> > -   }
> 
> The page table holding the PTE may be unmapped in pte_unmap_same(), so is
> it safe for us to access the page table after this in
> do_swap_page_readahead()?

That's why I call pte_offset_map() in swap_ra_info() before the access.

> 
> Best Regards,
> Huang, Ying
> 
> > entry = pte_to_swp_entry(vmf->orig_pte);
> > if (unlikely(non_swap_entry(entry))) {
> > @@ -2889,9 +2882,7 @@ int do_swap_page(struct vm_fault *vmf)
> >  
> >  
> > delayacct_set_flag(DELAYACCT_PF_SWAPIN);
> > -   if (!page)
> > -   page = lookup_swap_cache(entry, vma_readahead ? vma : NULL,
> > -vmf->address);
> > +   page = lookup_swap_cache(entry, vma, vmf->address);
> > if (!page) {
> > struct swap_info_struct *si = swp_swap_info(entry);
> >  
> > @@ -2907,9 +2898,9 @@ int do_swap_page(struct vm_fault *vmf)
> > swap_readpage(page, true);
> > }
> > } else {
> > -   if (vma_readahead)
> > +   if (swap_use_vma_readahead())
> > page = do_swap_page_readahead(entry,
> > -   GFP_HIGHUSER_MOVABLE, vmf, &swap_ra);
> > +   GFP_HIGHUSER_MOVABLE, vmf);
> > else
> > page = swapin_readahead(entry,
> >GFP_HIGHUSER_MOVABLE, vma, vmf->address);
> > diff --git a/mm/swap_state.c b/mm/swap_state.c
> > index 6c017ced11e6..e3c535fcd2df 100644
> > --- a/mm/swap_state.c
> > +++ b/mm/swap_state.c
> > @@ -331,32 +331,38 @@ stru

Re: [PATCH 1/2] mm:swap: clean up swap readahead

2017-10-31 Thread Huang, Ying
Hi, Minchan,

Minchan Kim  writes:

> Looking at the recent swap readahead changes, I am very unhappy with
> the current code structure, which splits the two swap readahead
> algorithms inside do_swap_page. This patch cleans it up.
>
> The main motivation is that the fault handler doesn't need to be aware
> of the readahead algorithms; it should just call swapin_readahead.
>
> As a first step, this patch cleans things up a little, but not
> completely (I split it out to make review easier), so the next patch
> completes the goal.
>
> Signed-off-by: Minchan Kim 
> ---
>  include/linux/swap.h | 17 ++
>  mm/memory.c  | 17 +++---
>  mm/swap_state.c  | 89 
> 
>  3 files changed, 55 insertions(+), 68 deletions(-)
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 84255b3da7c1..7c7c8b344bc9 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -427,12 +427,8 @@ extern struct page *__read_swap_cache_async(swp_entry_t, 
> gfp_t,
>   bool *new_page_allocated);
>  extern struct page *swapin_readahead(swp_entry_t, gfp_t,
>   struct vm_area_struct *vma, unsigned long addr);
> -
> -extern struct page *swap_readahead_detect(struct vm_fault *vmf,
> -   struct vma_swap_readahead *swap_ra);
>  extern struct page *do_swap_page_readahead(swp_entry_t fentry, gfp_t 
> gfp_mask,
> -struct vm_fault *vmf,
> -struct vma_swap_readahead *swap_ra);
> +struct vm_fault *vmf);
>  
>  /* linux/mm/swapfile.c */
>  extern atomic_long_t nr_swap_pages;
> @@ -551,15 +547,8 @@ static inline bool swap_use_vma_readahead(void)
>   return false;
>  }
>  
> -static inline struct page *swap_readahead_detect(
> - struct vm_fault *vmf, struct vma_swap_readahead *swap_ra)
> -{
> - return NULL;
> -}
> -
> -static inline struct page *do_swap_page_readahead(
> - swp_entry_t fentry, gfp_t gfp_mask,
> - struct vm_fault *vmf, struct vma_swap_readahead *swap_ra)
> +static inline struct page *do_swap_page_readahead(swp_entry_t fentry,
> + gfp_t gfp_mask, struct vm_fault *vmf)
>  {
>   return NULL;
>  }
> diff --git a/mm/memory.c b/mm/memory.c
> index 8a0c410037d2..e955298e4290 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2849,21 +2849,14 @@ int do_swap_page(struct vm_fault *vmf)
>   struct vm_area_struct *vma = vmf->vma;
>   struct page *page = NULL, *swapcache = NULL;
>   struct mem_cgroup *memcg;
> - struct vma_swap_readahead swap_ra;
>   swp_entry_t entry;
>   pte_t pte;
>   int locked;
>   int exclusive = 0;
>   int ret = 0;
> - bool vma_readahead = swap_use_vma_readahead();
>  
> - if (vma_readahead)
> - page = swap_readahead_detect(vmf, &swap_ra);
> - if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte)) {
> - if (page)
> - put_page(page);
> + if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
>   goto out;
> - }

The page table holding the PTE may be unmapped in pte_unmap_same(), so is
it safe for us to access the page table after this in
do_swap_page_readahead()?

Best Regards,
Huang, Ying

>   entry = pte_to_swp_entry(vmf->orig_pte);
>   if (unlikely(non_swap_entry(entry))) {
> @@ -2889,9 +2882,7 @@ int do_swap_page(struct vm_fault *vmf)
>  
>  
>   delayacct_set_flag(DELAYACCT_PF_SWAPIN);
> - if (!page)
> - page = lookup_swap_cache(entry, vma_readahead ? vma : NULL,
> -  vmf->address);
> + page = lookup_swap_cache(entry, vma, vmf->address);
>   if (!page) {
>   struct swap_info_struct *si = swp_swap_info(entry);
>  
> @@ -2907,9 +2898,9 @@ int do_swap_page(struct vm_fault *vmf)
>   swap_readpage(page, true);
>   }
>   } else {
> - if (vma_readahead)
> + if (swap_use_vma_readahead())
>   page = do_swap_page_readahead(entry,
> - GFP_HIGHUSER_MOVABLE, vmf, &swap_ra);
> + GFP_HIGHUSER_MOVABLE, vmf);
>   else
>   page = swapin_readahead(entry,
>  GFP_HIGHUSER_MOVABLE, vma, vmf->address);
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index 6c017ced11e6..e3c535fcd2df 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -331,32 +331,38 @@ struct page *lookup_swap_cache(swp_entry_t entry, 
> struct vm_area_struct *vma,
>  unsigned long addr)
>  {
>   struct page *page;
> - unsigned long ra_info;
> - int win, hits, readahead;
>  
>   page = find_get_page(swap_address_space(entry), 

[RFC] EPOLL_KILLME: New flag to epoll_wait() that subscribes process to death row (new syscall)

2017-10-31 Thread Shawn Landden
It is common for services to be stateless around their main event loop.
If a process passes the EPOLL_KILLME flag to epoll_wait5() then it
signals to the kernel that epoll_wait5() may not complete, and the kernel
may send SIGKILL if resources get tight.

See my systemd patch: https://github.com/shawnl/systemd/tree/killme

Android uses this memory model for all programs, and having it in the
kernel will enable integration with the page cache (not in this
series).
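
To make the bookkeeping in this RFC easier to follow, here is a minimal
user-space sketch in C of the "death row" idea: a task is queued just
before it blocks and dequeued when the wait returns. Everything in it is a
hypothetical model. struct dr_task, dr_enter(), dr_leave(), dr_wait() and
dr_poll_stub() are illustration names only, a pthread mutex stands in for
the kernel mutex, and a hand-rolled list stands in for list_head. It does
not call epoll_wait5(), which exists only in this patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a task that can sit on death row. */
struct dr_task {
	int pid;
	int on_deathrow;
	struct dr_task *prev, *next;
};

/* Circular list head, in the spirit of LIST_HEAD(deathrow_q). */
static struct dr_task dr_head = { .prev = &dr_head, .next = &dr_head };
static long dr_len;
static pthread_mutex_t dr_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Queue the task at the front of the list and mark it as killable. */
static void dr_enter(struct dr_task *t)
{
	pthread_mutex_lock(&dr_mutex);
	t->next = dr_head.next;
	t->prev = &dr_head;
	dr_head.next->prev = t;
	dr_head.next = t;
	t->on_deathrow = 1;
	dr_len++;
	pthread_mutex_unlock(&dr_mutex);
}

/* Unlink the task again once the wait has returned. */
static void dr_leave(struct dr_task *t)
{
	pthread_mutex_lock(&dr_mutex);
	assert(t->on_deathrow);
	t->prev->next = t->next;
	t->next->prev = t->prev;
	t->prev = t->next = NULL;
	t->on_deathrow = 0;
	dr_len--;
	pthread_mutex_unlock(&dr_mutex);
}

/* Stand-in for ep_poll(): reports the queue length seen while "blocked". */
static int dr_poll_stub(void)
{
	long len;

	pthread_mutex_lock(&dr_mutex);
	len = dr_len;
	pthread_mutex_unlock(&dr_mutex);
	return (int)len;
}

/* Model of the wait path: enqueue if asked, block, then dequeue. */
static int dr_wait(struct dr_task *t, int killme, int (*block)(void))
{
	int ret;

	if (killme)
		dr_enter(t);
	ret = block();
	if (killme)
		dr_leave(t);
	return ret;
}
```

With the flag set, the stub sees one queued task while the wait is in
progress and none afterwards. Without the flag the task is never queued. A
memory-pressure handler would only ever look at tasks that are currently
on the list.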
---
 arch/x86/entry/syscalls/syscall_32.tbl |  1 +
 arch/x86/entry/syscalls/syscall_64.tbl |  1 +
 fs/eventpoll.c | 74 +-
 include/linux/eventpoll.h  |  2 +
 include/linux/sched.h  |  3 ++
 include/uapi/asm-generic/unistd.h  |  5 ++-
 include/uapi/linux/eventpoll.h |  3 ++
 kernel/exit.c  |  2 +
 mm/oom_kill.c  | 17 
 9 files changed, 105 insertions(+), 3 deletions(-)

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 448ac2161112..040e5d02bdcc 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -391,3 +391,4 @@
 382	i386	pkey_free		sys_pkey_free
 383	i386	statx			sys_statx
 384	i386	arch_prctl		sys_arch_prctl			compat_sys_arch_prctl
+385	i386	epoll_wait5		sys_epoll_wait5
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 5aef183e2f85..c72802e8cf65 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -339,6 +339,7 @@
 330	common	pkey_alloc		sys_pkey_alloc
 331	common	pkey_free		sys_pkey_free
 332	common	statx			sys_statx
+333	common	epoll_wait5		sys_epoll_wait5
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 2fabd19cdeea..76d1c91d940b 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -297,6 +297,14 @@ static LIST_HEAD(visited_list);
  */
 static LIST_HEAD(tfile_check_list);
 
+static LIST_HEAD(deathrow_q);
+static long deathrow_len __read_mostly;
+
+/* TODO: Can this lock be removed by using atomic instructions to update
+ * queue?
+ */
+static DEFINE_MUTEX(deathrow_mutex);
+
 #ifdef CONFIG_SYSCTL
 
 #include 
@@ -314,6 +322,15 @@ struct ctl_table epoll_table[] = {
.extra1 = &zero,
.extra2 = &long_max,
},
+   {
+   .procname   = "deathrow_size",
+   .data   = &deathrow_len,
+   .maxlen = sizeof(deathrow_len),
+   .mode   = 0444,
+   .proc_handler   = proc_doulongvec_minmax,
+   .extra1 = &zero,
+   .extra2 = &long_max,
+   },
{ }
 };
 #endif /* CONFIG_SYSCTL */
@@ -2164,9 +2181,12 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd,
 /*
  * Implement the event wait interface for the eventpoll file. It is the kernel
  * part of the user space epoll_wait(2).
+ *
+ * A flags argument cannot be added to epoll_pwait cause it already has
+ * the maximum number of arguments (6). Can this be fixed?
  */
-SYSCALL_DEFINE4(epoll_wait, int, epfd, struct epoll_event __user *, events,
-   int, maxevents, int, timeout)
+SYSCALL_DEFINE5(epoll_wait5, int, epfd, struct epoll_event __user *, events,
+   int, maxevents, int, timeout, int, flags)
 {
int error;
struct fd f;
@@ -2199,14 +2219,44 @@ SYSCALL_DEFINE4(epoll_wait, int, epfd, struct epoll_event __user *, events,
 */
ep = f.file->private_data;
 
+   /* Check the EPOLL_* constants for conflicts.  */
+   BUILD_BUG_ON(EPOLL_KILLME == EPOLL_CLOEXEC);
+
+   if (flags & ~EPOLL_KILLME)
+   return -EINVAL;
+
+   if (flags & EPOLL_KILLME) {
+   /* Put process on death row. */
+   mutex_lock(&deathrow_mutex);
+   deathrow_len++;
+   list_add(&current->se.deathrow, &deathrow_q);
+   current->se.on_deathrow = 1;
+   mutex_unlock(&deathrow_mutex);
+   }
+
/* Time to fish for events ... */
error = ep_poll(ep, events, maxevents, timeout);
 
+   if (flags & EPOLL_KILLME) {
+   /* Remove process from death row. */
+   mutex_lock(&deathrow_mutex);
+   current->se.on_deathrow = 0;
+   list_del(&current->se.deathrow);
+   deathrow_len--;
+   mutex_unlock(&deathrow_mutex);
+   }
+
 error_fput:
fdput(f);
return error;
 }
 
+SYSCALL_DEFINE4(epoll_wait, int, epfd, struct epoll_event __user *, events,
+   int, maxevents, int, timeout)
+{
+   return sys_epoll_wait5(epfd, events, maxevent

[PATCH 0/2] swap readahead clean up

2017-10-31 Thread Minchan Kim
This patchset cleans up the recently added vma-based readahead code by
unifying it with cluster-based readahead.

Minchan Kim (2):
  mm:swap: clean up swap readahead
  mm:swap: unify cluster-based and vma-based swap readahead

 include/linux/swap.h |  32 +++
 mm/memory.c  |  24 +++
 mm/shmem.c   |   5 ++-
 mm/swap_state.c  | 110 +--
 4 files changed, 87 insertions(+), 84 deletions(-)

-- 
2.7.4



[PATCH 1/2] mm:swap: clean up swap readahead

2017-10-31 Thread Minchan Kim
Looking at the recent swap readahead changes, I am very unhappy with
the current code structure, which splits the two swap readahead
algorithms inside do_swap_page. This patch cleans it up.

The main motivation is that the fault handler doesn't need to be aware
of the readahead algorithms; it should just call swapin_readahead.

As a first step, this patch cleans things up a little, but not
completely (I split it out to make review easier), so the next patch
completes the goal.

Signed-off-by: Minchan Kim 
---
 include/linux/swap.h | 17 ++
 mm/memory.c  | 17 +++---
 mm/swap_state.c  | 89 
 3 files changed, 55 insertions(+), 68 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 84255b3da7c1..7c7c8b344bc9 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -427,12 +427,8 @@ extern struct page *__read_swap_cache_async(swp_entry_t, gfp_t,
bool *new_page_allocated);
 extern struct page *swapin_readahead(swp_entry_t, gfp_t,
struct vm_area_struct *vma, unsigned long addr);
-
-extern struct page *swap_readahead_detect(struct vm_fault *vmf,
- struct vma_swap_readahead *swap_ra);
 extern struct page *do_swap_page_readahead(swp_entry_t fentry, gfp_t gfp_mask,
-  struct vm_fault *vmf,
-  struct vma_swap_readahead *swap_ra);
+  struct vm_fault *vmf);
 
 /* linux/mm/swapfile.c */
 extern atomic_long_t nr_swap_pages;
@@ -551,15 +547,8 @@ static inline bool swap_use_vma_readahead(void)
return false;
 }
 
-static inline struct page *swap_readahead_detect(
-   struct vm_fault *vmf, struct vma_swap_readahead *swap_ra)
-{
-   return NULL;
-}
-
-static inline struct page *do_swap_page_readahead(
-   swp_entry_t fentry, gfp_t gfp_mask,
-   struct vm_fault *vmf, struct vma_swap_readahead *swap_ra)
+static inline struct page *do_swap_page_readahead(swp_entry_t fentry,
+   gfp_t gfp_mask, struct vm_fault *vmf)
 {
return NULL;
 }
diff --git a/mm/memory.c b/mm/memory.c
index 8a0c410037d2..e955298e4290 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2849,21 +2849,14 @@ int do_swap_page(struct vm_fault *vmf)
struct vm_area_struct *vma = vmf->vma;
struct page *page = NULL, *swapcache = NULL;
struct mem_cgroup *memcg;
-   struct vma_swap_readahead swap_ra;
swp_entry_t entry;
pte_t pte;
int locked;
int exclusive = 0;
int ret = 0;
-   bool vma_readahead = swap_use_vma_readahead();
 
-   if (vma_readahead)
-   page = swap_readahead_detect(vmf, &swap_ra);
-   if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte)) {
-   if (page)
-   put_page(page);
+   if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
goto out;
-   }
 
entry = pte_to_swp_entry(vmf->orig_pte);
if (unlikely(non_swap_entry(entry))) {
@@ -2889,9 +2882,7 @@ int do_swap_page(struct vm_fault *vmf)
 
 
delayacct_set_flag(DELAYACCT_PF_SWAPIN);
-   if (!page)
-   page = lookup_swap_cache(entry, vma_readahead ? vma : NULL,
-vmf->address);
+   page = lookup_swap_cache(entry, vma, vmf->address);
if (!page) {
struct swap_info_struct *si = swp_swap_info(entry);
 
@@ -2907,9 +2898,9 @@ int do_swap_page(struct vm_fault *vmf)
swap_readpage(page, true);
}
} else {
-   if (vma_readahead)
+   if (swap_use_vma_readahead())
page = do_swap_page_readahead(entry,
-   GFP_HIGHUSER_MOVABLE, vmf, &swap_ra);
+   GFP_HIGHUSER_MOVABLE, vmf);
else
page = swapin_readahead(entry,
   GFP_HIGHUSER_MOVABLE, vma, vmf->address);
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 6c017ced11e6..e3c535fcd2df 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -331,32 +331,38 @@ struct page *lookup_swap_cache(swp_entry_t entry, struct vm_area_struct *vma,
   unsigned long addr)
 {
struct page *page;
-   unsigned long ra_info;
-   int win, hits, readahead;
 
page = find_get_page(swap_address_space(entry), swp_offset(entry));
 
INC_CACHE_INFO(find_total);
if (page) {
+   bool vma_ra = swap_use_vma_readahead();
+   bool readahead = TestClearPageReadahead(page);
+
INC_CACHE_INFO(find_success);
if (unlikely(PageTransCompound(page)))
return pa

[PATCH 2/2] mm:swap: unify cluster-based and vma-based swap readahead

2017-10-31 Thread Minchan Kim
This patch removes the need for do_swap_page to be aware of the two
different swap readahead algorithms by unifying the cluster-based and
vma-based readahead function calls.
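
The shape of the unification can be sketched in plain user-space C. This
is a hedged model, not the kernel code: the diff as archived here is cut
off before the new swapin_readahead() body, so the dispatch on a single
flag is an assumption, and every name ending in _model, along with
enum ra_kind and the use_vma_readahead flag, is hypothetical. The point
is only that the caller hands over the fault information and no longer
chooses an algorithm itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; only the fields used here. */
struct vma_model {
	int id;
};

struct fault_model {
	struct vma_model *vma;
	unsigned long address;
};

/* Which algorithm served the request; real code returns a struct page *. */
enum ra_kind { RA_CLUSTER, RA_VMA };

/* Models the result of swap_use_vma_readahead(). */
static bool use_vma_readahead;

static enum ra_kind cluster_readahead_model(unsigned long entry,
					    const struct fault_model *vmf)
{
	(void)entry;
	assert(vmf != NULL);
	return RA_CLUSTER;
}

static enum ra_kind vma_readahead_model(unsigned long entry,
					const struct fault_model *vmf)
{
	(void)entry;
	assert(vmf != NULL);
	return RA_VMA;
}

/* Single entry point: the fault handler no longer picks the algorithm. */
static enum ra_kind swapin_readahead_model(unsigned long entry,
					   const struct fault_model *vmf)
{
	return use_vma_readahead ? vma_readahead_model(entry, vmf)
				 : cluster_readahead_model(entry, vmf);
}
```

A caller such as the fault path would then make one call to
swapin_readahead_model(entry, &vmf), while a caller that always wants the
cluster behaviour, as shmem does in the diff below, calls the cluster
variant directly.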

Signed-off-by: Minchan Kim 
---
 include/linux/swap.h | 17 -
 mm/memory.c  | 11 ---
 mm/shmem.c   |  5 -
 mm/swap_state.c  | 21 +++--
 4 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 7c7c8b344bc9..9cc330360eac 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -425,9 +425,11 @@ extern struct page *read_swap_cache_async(swp_entry_t, gfp_t,
 extern struct page *__read_swap_cache_async(swp_entry_t, gfp_t,
struct vm_area_struct *vma, unsigned long addr,
bool *new_page_allocated);
-extern struct page *swapin_readahead(swp_entry_t, gfp_t,
-   struct vm_area_struct *vma, unsigned long addr);
-extern struct page *do_swap_page_readahead(swp_entry_t fentry, gfp_t gfp_mask,
+extern struct page *cluster_readahead(swp_entry_t entry, gfp_t flag,
+   struct vm_fault *vmf);
+extern struct page *swapin_readahead(swp_entry_t entry, gfp_t flag,
+   struct vm_fault *vmf);
+extern struct page *vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
   struct vm_fault *vmf);
 
 /* linux/mm/swapfile.c */
@@ -536,8 +538,13 @@ static inline void put_swap_page(struct page *page, swp_entry_t swp)
 {
 }
 
+static inline struct page *cluster_readahead(swp_entry_t, gfp_t gfp_mask
+   struct vm_fault *vmf)
+{
+}
+
 static inline struct page *swapin_readahead(swp_entry_t swp, gfp_t gfp_mask,
-   struct vm_area_struct *vma, unsigned long addr)
+   struct vm_fault *vmf)
 {
return NULL;
 }
@@ -547,7 +554,7 @@ static inline bool swap_use_vma_readahead(void)
return false;
 }
 
-static inline struct page *do_swap_page_readahead(swp_entry_t fentry,
+static inline struct page *vma_readahead(swp_entry_t fentry,
gfp_t gfp_mask, struct vm_fault *vmf)
 {
return NULL;
diff --git a/mm/memory.c b/mm/memory.c
index e955298e4290..ce5e3d7ccc5c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2889,7 +2889,8 @@ int do_swap_page(struct vm_fault *vmf)
if (si->flags & SWP_SYNCHRONOUS_IO &&
__swap_count(si, entry) == 1) {
/* skip swapcache */
-   page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vmf->address);
+   page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma,
+   vmf->address);
if (page) {
__SetPageLocked(page);
__SetPageSwapBacked(page);
@@ -2898,12 +2899,8 @@ int do_swap_page(struct vm_fault *vmf)
swap_readpage(page, true);
}
} else {
-   if (swap_use_vma_readahead())
-   page = do_swap_page_readahead(entry,
-   GFP_HIGHUSER_MOVABLE, vmf);
-   else
-   page = swapin_readahead(entry,
-  GFP_HIGHUSER_MOVABLE, vma, vmf->address);
+   page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
+   vmf);
swapcache = page;
}
 
diff --git a/mm/shmem.c b/mm/shmem.c
index 62dfdc097e44..2522bc0958e1 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1413,9 +1413,12 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
 {
struct vm_area_struct pvma;
struct page *page;
+   struct vm_fault vmf;
 
shmem_pseudo_vma_init(&pvma, info, index);
-   page = swapin_readahead(swap, gfp, &pvma, 0);
+   vmf.vma = &pvma;
+   vmf.address = 0;
+   page = cluster_readahead(swap, gfp, &vmf);
shmem_pseudo_vma_destroy(&pvma);
 
return page;
diff --git a/mm/swap_state.c b/mm/swap_state.c
index e3c535fcd2df..5ee53d4ee047 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -538,11 +538,10 @@ static unsigned long swapin_nr_pages(unsigned long offset)
 }
 
 /**
- * swapin_readahead - swap in pages in hope we need them soon
+ * cluster_readahead - swap in pages in hope we need them soon
  * @entry: swap entry of this memory
  * @gfp_mask: memory allocation flags
- * @vma: user vma this address belongs to
- * @addr: target address for mempolicy
+ * @vmf: fault information
  *
  * Returns the struct page for entry and addr, after queueing swapin.
  *
@@ -556,8 +555,8 @@ static unsigned long swapin_nr_pages(unsigned long offset)
  *
  * Caller must hold down_read 

Re: xfs: list corruption in xfs_setup_inode()

2017-10-31 Thread Dave Chinner
On Tue, Oct 31, 2017 at 09:43:03PM -0700, Cong Wang wrote:
> On Tue, Oct 31, 2017 at 8:05 PM, Dave Chinner  wrote:
> > On Tue, Oct 31, 2017 at 06:51:08PM -0700, Cong Wang wrote:
> >> >> Please let me know if I can provide any other information.
> >> >
> >> > How do you reproduce the problem?
> >>
> >> The warning is reported via ABRT email, we don't know what was
> >> happening at the time of crash.
> >
> > Which makes it even harder to track down. Perhaps you should
> > configure the box to crashdump on such a failure and then we
> > can do some post-failure forensic analysis...
> 
> Yeah.
> 
> We are trying to get kdump working, but even if kdump works
> we still can't turn on panic_on_warn since this is a production
> machine.

Hmmm. Ok, maybe you could leave the xfs_iget* trace points running
and check the log tail for unusual events around the time of the
next crash, e.g. xfs_iget_reclaim_fail events. That might point us
to a potential interaction we can look at more closely. I'd also
suggest slab poisoning, as that will catch other lifecycle problems
that could be causing list corruptions, such as use-after-free.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH 1/2] bpf: add a bpf_override_function helper

2017-10-31 Thread Alexei Starovoitov

On 10/31/17 8:45 AM, Josef Bacik wrote:

From: Josef Bacik 

Error injection is sloppy and very ad-hoc.  BPF could fill this niche
perfectly with its kprobe functionality.  We could make sure errors are
only triggered in specific call chains that we care about with very
specific situations.  Accomplish this with the bpf_override_function
helper.  This will modify the probed caller's return value to the
specified value and set the PC to an override function that simply
returns, bypassing the originally probed function.  This gives us a nice
clean way to implement systematic error injection for all of our code
paths.
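
For readers following along, the mechanism described above can be modelled in userspace roughly as follows. This is only a toy sketch under stated assumptions: struct toy_regs and every toy_* name are hypothetical stand-ins, not the kernel's struct pt_regs or the helpers added by the patch. It shows the two steps only: store the requested return value in the return register, then point the instruction pointer at a stub that does nothing but return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for x86 struct pt_regs. */
struct toy_regs {
	uintptr_t ax;	/* return-value register */
	uintptr_t ip;	/* instruction pointer */
};

/* Stands in for override_func: a function that simply returns. */
static void toy_override_stub(void)
{
}

/* Step 1: place the injected return value where the caller will read it. */
static void toy_set_return_value(struct toy_regs *regs, uintptr_t rc)
{
	regs->ax = rc;
}

/* Step 2: redirect execution so the probed function body is skipped. */
static void toy_override_function(struct toy_regs *regs)
{
	regs->ip = (uintptr_t)&toy_override_stub;
}

/* Both steps together, as the commit message describes them. */
static void toy_override_return(struct toy_regs *regs, uintptr_t rc)
{
	assert(regs != NULL);
	toy_set_return_value(regs, rc);
	toy_override_function(regs);
}
```

In this model, calling toy_override_return(&regs, (uintptr_t)-12) leaves regs.ax holding the injected error value and regs.ip pointing at the stub, which is the state the real helper is described as producing for the probed function's caller.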

Signed-off-by: Josef Bacik 
---
 arch/Kconfig |  3 +++
 arch/x86/Kconfig |  1 +
 arch/x86/include/asm/kprobes.h   |  4 
 arch/x86/include/asm/ptrace.h|  5 +
 arch/x86/kernel/kprobes/ftrace.c | 14 ++
 include/linux/trace_events.h |  7 +++
 include/uapi/linux/bpf.h |  7 ++-
 kernel/trace/Kconfig | 11 +++
 kernel/trace/bpf_trace.c | 30 
 kernel/trace/trace_kprobe.c  | 42 +---
 10 files changed, 116 insertions(+), 8 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index d789a89cb32c..4fb618082259 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -195,6 +195,9 @@ config HAVE_OPTPROBES
 config HAVE_KPROBES_ON_FTRACE
bool

+config HAVE_KPROBE_OVERRIDE
+   bool
+
 config HAVE_NMI
bool

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 971feac13506..5126d2750dd0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -152,6 +152,7 @@ config X86
select HAVE_KERNEL_XZ
select HAVE_KPROBES
select HAVE_KPROBES_ON_FTRACE
+   select HAVE_KPROBE_OVERRIDE
select HAVE_KRETPROBES
select HAVE_KVM
select HAVE_LIVEPATCH   if X86_64
diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
index 6cf65437b5e5..c6c3b1f4306a 100644
--- a/arch/x86/include/asm/kprobes.h
+++ b/arch/x86/include/asm/kprobes.h
@@ -67,6 +67,10 @@ extern const int kretprobe_blacklist_size;
 void arch_remove_kprobe(struct kprobe *p);
 asmlinkage void kretprobe_trampoline(void);

+#ifdef CONFIG_KPROBES_ON_FTRACE
+extern void arch_ftrace_kprobe_override_function(struct pt_regs *regs);
+#endif
+
 /* Architecture specific copy of original instruction*/
 struct arch_specific_insn {
/* copy of the original instruction */
diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 91c04c8e67fa..f04e71800c2f 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -108,6 +108,11 @@ static inline unsigned long regs_return_value(struct pt_regs *regs)
return regs->ax;
 }

+static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
+{
+   regs->ax = rc;
+}
+
 /*
  * user_mode(regs) determines whether a register set came from user
  * mode.  On x86_32, this is true if V8086 mode was enabled OR if the
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index 041f7b6dfa0f..3c455bf490cb 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -97,3 +97,17 @@ int arch_prepare_kprobe_ftrace(struct kprobe *p)
p->ainsn.boostable = false;
return 0;
 }
+
+asmlinkage void override_func(void);
+asm(
+   ".type override_func, @function\n"
+   "override_func:\n"
+   "  ret\n"
+   ".size override_func, .-override_func\n"
+);
+
+void arch_ftrace_kprobe_override_function(struct pt_regs *regs)
+{
+   regs->ip = (unsigned long)&override_func;
+}
+NOKPROBE_SYMBOL(arch_ftrace_kprobe_override_function);
diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index fc6aeca945db..9179f109c49b 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -521,7 +521,14 @@ do { \
 #ifdef CONFIG_PERF_EVENTS
 struct perf_event;

+enum {
+   BPF_STATE_NORMAL_KPROBE = 0,
+   BPF_STATE_FTRACE_KPROBE,
+   BPF_STATE_MODIFIED_PC,
+};
+
 DECLARE_PER_CPU(struct pt_regs, perf_trace_regs);
+DECLARE_PER_CPU(int, bpf_kprobe_state);

 extern int  perf_trace_init(struct perf_event *event);
 extern void perf_trace_destroy(struct perf_event *event);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 0b7b54d898bd..1ad5b87a42f6 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -673,6 +673,10 @@ union bpf_attr {
  * @buf: buf to fill
  * @buf_size: size of the buf
  * Return : 0 on success or negative error code
+ *
+ * int bpf_override_return(pt_regs, rc)
+ * @pt_regs: pointer to struct pt_regs
+ * @rc: the return value to set
  */
 #define __BPF_FUNC_MAPPER(FN)  \
FN(unspec), \
@@ -732,7 +736,8 @@ union bpf_attr {
 

Re: xfs: list corruption in xfs_setup_inode()

2017-10-31 Thread Cong Wang
On Tue, Oct 31, 2017 at 8:05 PM, Dave Chinner  wrote:
> On Tue, Oct 31, 2017 at 06:51:08PM -0700, Cong Wang wrote:
>> On Mon, Oct 30, 2017 at 5:33 PM, Dave Chinner  wrote:
>> > On Mon, Oct 30, 2017 at 02:55:43PM -0700, Cong Wang wrote:
>> >> Hello,
>> >>
>> >> We triggered a list corruption (double add) warning below on our 4.9
>> >> kernel (the 4.9 kernel we use is based on -stable release, with only a
>> >> few unrelated networking backports):
> ...
>> >> 4.9.34.el7.x86_64 #1
>> >> Hardware name: TYAN S5512/S5512, BIOS V8.B13 03/20/2014
>> >>  b0d48a0abb30 8e389f47 b0d48a0abb80 
>> >>  b0d48a0abb70 8e08989b 0024 8d9d691e0aa0
>> >>  8d9d7a716608 8d9d691e0aa0 4000 8d9d7de6d800
>> >> Call Trace:
>> >>  [] dump_stack+0x4d/0x66
>> >>  [] __warn+0xcb/0xf0
>> >>  [] warn_slowpath_fmt+0x5f/0x80
>> >>  [] __list_add+0xac/0xb0
>> >>  [] inode_sb_list_add+0x3b/0x50
>> >>  [] xfs_setup_inode+0x2c/0x170 [xfs]
>> >>  [] xfs_ialloc+0x317/0x5c0 [xfs]
>> >>  [] xfs_dir_ialloc+0x77/0x220 [xfs]
>> >
>> > Inode allocation, so should be a new inode straight from the slab
>> > cache. That implies memory corruption of some kind. Please turn on
>> > slab poisoning and try to reproduce.
>>
>> Are you sure? xfs_iget() seems to search a cache before allocating
>> a new one:
>
> /me sighs
>
> You started with "I don't know the XFS code very well", so I omitted
> the complexity of describing about 10 different corner cases where
> we /could/ find the unlinked inode still in the cache via the
> lookup. But they aren't common cases - the common case in the real
> world is allocation of cache cold inodes. IOWs: "so should be a new
> inode straight from the slab cache".
>
> So, yes, we could find the old unlinked inode still cached in the
> XFS inode cache, but I don't have the time to explain how RCU lookup
> code works to everyone who reports a bug.

Oh, sorry about it. I understand it now.


>
> All you need to understand is that all of this happens below the VFS
> and so inodes being reclaimed or newly allocated the in-cache inode
> should never, ever be on the VFS sb inode list.
>

OK.


>> >>  [] ? down_write+0x12/0x40
>> >>  [] xfs_create+0x482/0x760 [xfs]
>> >>  [] xfs_generic_create+0x21e/0x2c0 [xfs]
>> >>  [] xfs_vn_mknod+0x14/0x20 [xfs]
>> >>  [] xfs_vn_mkdir+0x16/0x20 [xfs]
>> >>  [] vfs_mkdir+0xe8/0x140
>> >>  [] SyS_mkdir+0x7a/0xf0
>> >>  [] entry_SYSCALL_64_fastpath+0x13/0x94
>> >>
>> >> _Without_ looking deeper, it seems this warning could be shut up by:
>> >>
>> >> --- a/fs/xfs/xfs_icache.c
>> >> +++ b/fs/xfs/xfs_icache.c
>> >> @@ -1138,6 +1138,8 @@ xfs_reclaim_inode(
>> >> xfs_iunlock(ip, XFS_ILOCK_EXCL);
>> >>
>> >> XFS_STATS_INC(ip->i_mount, xs_ig_reclaims);
>> >> +
>> >> +   inode_sb_list_del(VFS_I(ip));
>> >>
>> >> with properly exporting inode_sb_list_del(). Does this make any sense?
>> >
>> > No, because by this stage the inode has already been removed from
>> > the superblock inode list. Doing this sort of thing here would just
>> > paper over whatever the underlying problem might be.
>>
>>
>> For me, it looks like the inode in the cache pag->pag_ici_root
>> is not removed from sb list before removing from cache.
>
> Sure, we have list corruption. Where we detect that corruption
> implies nothing about the cause of the list corruption. The two
> events are not connected in any way. Clearing that VFS list here
> does nothing to fix the problem causing the list corruption to
> occur.

OK.

>
>> >> Please let me know if I can provide any other information.
>> >
>> > How do you reproduce the problem?
>>
>> The warning is reported via ABRT email, we don't know what was
>> happening at the time of crash.
>
> Which makes it even harder to track down. Perhaps you should
> configure the box to crashdump on such a failure and then we
> can do some post-failure forensic analysis...

Yeah.

We are trying to get kdump working, but even if kdump works
we still can't turn on panic_on_warn since this is a production machine.


Thanks!


Re: [PATCH v2] sched/sysctl: Fix attributes of some extern declarations

2017-10-31 Thread Nick Desaulniers
> On Tue, Oct 30, 2017 at 10:57:58AM +0100, Ingo Molnar wrote:
>> So I hate this change, because it pointlessly duplicates an attribute that
>> should only matter at the definition site.
>
> It's certainly not ideal, and then again essentially the same is done
> in kernel/sched/sched.h, just that here the specific attribute is
> hidden behind const_debug.
>
>> The Clang warning:
>>
>> >   kernel/sched/sched.h:1618:33: warning: section attribute is specified on
>> > redeclared variable [-Wsection]
>>
>> suggests that the -Wsection warning can be turned off. The Clang build should
>> probably do that.

Naive question: can these definitions be hoisted to include/linux/sched.h?


[PATCH V10 1/2] kasan: use %pK to print addresses instead of %p

2017-10-31 Thread Tobin C. Harding
This is in preparation for hashing addresses printed using %p. KASAN
needs the actual address for error reporting.

Use %pK instead of %p to print addresses.

Signed-off-by: Tobin C. Harding 
---
 mm/kasan/report.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 6bcfb01ba038..ad042f025a1a 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -134,7 +134,7 @@ static void print_error_description(struct kasan_access_info *info)
 
pr_err("BUG: KASAN: %s in %pS\n",
bug_type, (void *)info->ip);
-   pr_err("%s of size %zu at addr %p by task %s/%d\n",
+   pr_err("%s of size %zu at addr %pK by task %s/%d\n",
info->is_write ? "Write" : "Read", info->access_size,
info->access_addr, current->comm, task_pid_nr(current));
 }
@@ -206,7 +206,7 @@ static void describe_object_addr(struct kmem_cache *cache, void *object,
const char *rel_type;
int rel_bytes;
 
-   pr_err("The buggy address belongs to the object at %p\n"
+   pr_err("The buggy address belongs to the object at %pK\n"
   " which belongs to the cache %s of size %d\n",
object, cache->name, cache->object_size);
 
@@ -225,7 +225,7 @@ static void describe_object_addr(struct kmem_cache *cache, void *object,
}
 
pr_err("The buggy address is located %d bytes %s of\n"
-  " %d-byte region [%p, %p)\n",
+  " %d-byte region [%pK, %pK)\n",
rel_bytes, rel_type, cache->object_size, (void *)object_addr,
(void *)(object_addr + cache->object_size));
 }
@@ -302,7 +302,7 @@ static void print_shadow_for_address(const void *addr)
char shadow_buf[SHADOW_BYTES_PER_ROW];
 
snprintf(buffer, sizeof(buffer),
-   (i == 0) ? ">%p: " : " %p: ", kaddr);
+   (i == 0) ? ">%pK: " : " %pK: ", kaddr);
/*
 * We should not pass a shadow pointer to generic
 * function, because generic functions may try to
-- 
2.7.4



[PATCH V10 2/2] printk: hash addresses printed with %p

2017-10-31 Thread Tobin C. Harding
Currently there are many places in the kernel where addresses are being
printed using an unadorned %p. Kernel pointers should be printed using
%pK allowing some control via the kptr_restrict sysctl. Exposing addresses
gives attackers sensitive information about the kernel layout in memory.

We can reduce the attack surface by hashing all addresses printed with
%p. This will of course break some users, forcing code printing needed
addresses to be updated.

For what it's worth, usage of unadorned %p can be broken down as
follows (thanks to Joe Perches).

$ git grep -E '%p[^A-Za-z0-9]' | cut -f1 -d"/" | sort | uniq -c
   1084 arch
 20 block
 10 crypto
 32 Documentation
   8121 drivers
   1221 fs
143 include
101 kernel
 69 lib
100 mm
   1510 net
 40 samples
  7 scripts
 11 security
166 sound
152 tools
  2 virt

Add function ptr_to_id() to map an address to a 32 bit unique
identifier. Hash any unadorned usage of specifier %p and any malformed
specifiers.

Signed-off-by: Tobin C. Harding 

---
 Documentation/printk-formats.txt |  17 +++-
 lib/test_printf.c| 108 +++-
 lib/vsprintf.c   | 176 ---
 3 files changed, 213 insertions(+), 88 deletions(-)

diff --git a/Documentation/printk-formats.txt b/Documentation/printk-formats.txt
index 361789df51ec..ec7deb80d035 100644
--- a/Documentation/printk-formats.txt
+++ b/Documentation/printk-formats.txt
@@ -5,6 +5,9 @@ How to get printk format specifiers right
 :Author: Randy Dunlap 
 :Author: Andrew Murray 
 
+Please do not print kernel addresses using %x. Exposing kernel addresses to
+user space leaks sensitive information that increases the attack surface of the
+kernel. In order to print pointers, please see 'Pointer Types' below.
 
 Integer types
 =
@@ -45,6 +48,18 @@ return from vsnprintf.
 Raw pointer value SHOULD be printed with %p. The kernel supports
 the following extended format specifiers for pointer types:
 
+Pointer Types
+=
+
+Pointers printed without a specifier extension (i.e unadorned %p) are hashed
+to give a unique identifier without leaking kernel addresses to user space.
+If you _really_ want to see the address please use %pK (see 'Kernel Pointers'
+below). On 64 bit machines the first 32 bits are zeroed.
+
+::
+
+   %p  abcdef12 or 00000000abcdef12
+
 Symbols/Function Pointers
 =
 
@@ -91,7 +106,7 @@ Kernel Pointers
 
 ::
 
-   %pK 0x01234567 or 0x0123456789abcdef
+   %pK 01234567 or 0123456789abcdef
 
 For printing kernel pointers which should be hidden from unprivileged
 users. The behaviour of ``%pK`` depends on the ``kptr_restrict sysctl`` - see
diff --git a/lib/test_printf.c b/lib/test_printf.c
index 563f10e6876a..71ebfa43ad05 100644
--- a/lib/test_printf.c
+++ b/lib/test_printf.c
@@ -24,24 +24,6 @@
 #define PAD_SIZE 16
 #define FILL_CHAR '$'
 
-#define PTR1 ((void*)0x01234567)
-#define PTR2 ((void*)(long)(int)0xfedcba98)
-
-#if BITS_PER_LONG == 64
-#define PTR1_ZEROES "0"
-#define PTR1_SPACES " "
-#define PTR1_STR "1234567"
-#define PTR2_STR "fedcba98"
-#define PTR_WIDTH 16
-#else
-#define PTR1_ZEROES "0"
-#define PTR1_SPACES " "
-#define PTR1_STR "1234567"
-#define PTR2_STR "fedcba98"
-#define PTR_WIDTH 8
-#endif
-#define PTR_WIDTH_STR stringify(PTR_WIDTH)
-
 static unsigned total_tests __initdata;
 static unsigned failed_tests __initdata;
 static char *test_buffer __initdata;
@@ -217,30 +199,79 @@ test_string(void)
test("a  |   |   ", "%-3.s|%-3.0s|%-3.*s", "a", "b", 0, "c");
 }
 
+#define PLAIN_BUF_SIZE 64  /* leave some space so we don't oops */
+
+#if BITS_PER_LONG == 64
+
+#define PTR_WIDTH 16
+#define PTR ((void *)0x0123456789ab)
+#define PTR_STR "0123456789ab"
+#define ZEROS "00000000"   /* hex 32 zero bits */
+
+static int __init
+plain_format(void)
+{
+   char buf[PLAIN_BUF_SIZE];
+   int nchars;
+
+   nchars = snprintf(buf, PLAIN_BUF_SIZE, "%p", PTR);
+
+   if (nchars != PTR_WIDTH || strncmp(buf, ZEROS, strlen(ZEROS)) != 0)
+   return -1;
+
+   return 0;
+}
+
+#else
+
+#define PTR_WIDTH 8
+#define PTR ((void *)0x456789ab)
+#define PTR_STR "456789ab"
+
+static int __init
+plain_format(void)
+{
+   /* Format is implicitly tested for 32 bit machines by plain_hash() */
+   return 0;
+}
+
+#endif /* BITS_PER_LONG == 64 */
+
+static int __init
+plain_hash(void)
+{
+   char buf[PLAIN_BUF_SIZE];
+   int nchars;
+
+   nchars = snprintf(buf, PLAIN_BUF_SIZE, "%p", PTR);
+
+   if (nchars != PTR_WIDTH || strncmp(buf, PTR_STR, PTR_WIDTH) == 0)
+   return -1;
+
+   return 0;
+}
+
+/*
+ * We can't use test() to test %p because we don't know what output to expect
+ * after an address is hashed.
+ */
 static void __init
 plain(void)
 {
-   test(PTR1_ZEROES PTR1_STR " " PTR2_STR, "%p %p", PTR1, PTR2);
-   /*
-*

[PATCH V10 0/2] printk: hash addresses printed with %p

2017-10-31 Thread Tobin C. Harding
Currently there are many places in the kernel where addresses are being
printed using an unadorned %p. Kernel pointers should be printed using
%pK allowing some control via the kptr_restrict sysctl. Exposing
addresses gives attackers sensitive information about the kernel layout
in memory.

We can reduce the attack surface by hashing all addresses printed with
%p. This will of course break some users, forcing code printing needed
addresses to be updated.

This version adds testing; this is my first effort at kernel unit
testing. Modules in `lib` don't seem to be contained within a selftest
target, so to develop the tests incrementally I implemented them in
`lib/test_printf.c`, built with `make M=lib`, and then, instead of
running selftest, spun up a VM and inserted the module manually.
Comments or suggestions much appreciated.

Here is the behaviour that this series implements.

For kptr_restrict==0

Randomness not ready:
  printed with %p: (ptrval) # NOTE: with padding
Valid pointer:
  printed with %pK: deadbeefdeadbeef
  printed with %p:  00000000deadbeef
  malformed specifier (eg %i):  00000000deadbeef
NULL pointer:
  printed with %pK: 0000000000000000
  printed with %p:   (null) # NOTE: with padding
  malformed specifier (eg %i):   (null)

For kptr_restrict==2

Valid pointer:
  printed with %pK: 0000000000000000

All other output as for kptr_restrict==0
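
As a rough illustration of the behaviour listed above, here is a userspace sketch of mapping an address to a 32-bit identifier and printing it zero-padded to 16 hex digits. It rests on loud assumptions: the series uses SipHash with a random key, whereas toy_mix64() below is just the splitmix64 finalizer used as a stand-in (it is NOT SipHash and not suitable for security), and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in mixer (splitmix64 finalizer). NOT SipHash, NOT secure. */
static uint64_t toy_mix64(uint64_t x)
{
	x ^= x >> 30;
	x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27;
	x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return x;
}

/* Map an address to a 32-bit identifier under a secret key. */
static uint32_t toy_ptr_to_id(uint64_t key, const void *ptr)
{
	return (uint32_t)toy_mix64(key ^ (uint64_t)(uintptr_t)ptr);
}

/*
 * Format the identifier as on a 64-bit machine: 16 hex digits, the upper
 * 32 bits always zero, no "0x". Returns the number of characters that
 * snprintf() reports.
 */
static int toy_format_id(char *buf, size_t len, uint64_t key, const void *ptr)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%016llx",
			(unsigned long long)toy_ptr_to_id(key, ptr));
}
```

The same key and pointer always yield the same identifier, so two log lines that print the same object can still be correlated, while the raw address itself is not shown.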

V10:
 - Add patch so KASAN uses %pK instead of %p. 
 - Add documentation to Documentation/printk-formats.txt
 - Add tests to lib/test_printf.c
 - Change "(pointer value)" -> "(ptrval)" to fit within columns on 32
   bit machines.

V9:
 - Drop the initial patch from V8, leaving null pointer handling as is.
 - Print the hashed ID _without_ a '0x' prefix.
 - Mask the first 32 bits of the hashed ID to all zeros on 64 bit
   architectures.

V8:
 - Add second patch cleaning up null pointer printing in pointer()
 - Move %pK handling to separate function, further cleaning up pointer()
 - Move ptr_to_id() call outside of switch statement making hashing
   the default behaviour (including malformed specifiers).
 - Remove use of static_key, replace with simple boolean.

V7:
 - Use tabs instead of spaces (ouch!).

V6:
 - Use __early_initcall() to fill the SipHash key.
 - Use static keys to guard hashing before the key is available.

V5:
 - Remove spin lock.
 - Add Jason A. Donenfeld to CC list by request.
 - Add Theodore Ts'o to CC list due to comment on previous version.

V4:
 - Remove changes to siphash.{ch}
 - Do word size check, and return value cast, directly in ptr_to_id().
 - Use add_ready_random_callback() to guard call to get_random_bytes()

V3:
 - Use atomic_xchg() to guard setting [random] key.
 - Remove erroneous white space change.

V2:
 - Use SipHash to do the hashing.

The discussion related to this patch has been fragmented. There are
three threads associated with this patch. Email threads by subject:

[PATCH] printk: hash addresses printed with %p
[PATCH 0/3] add %pX specifier
[kernel-hardening] [RFC V2 0/6] add more kernel pointer filter options

Tobin C. Harding (2):
  kasan: use %pK to print addresses instead of %p
  printk: hash addresses printed with %p

 Documentation/printk-formats.txt |  17 +++-
 lib/test_printf.c| 108 +++-
 lib/vsprintf.c   | 176 ---
 mm/kasan/report.c|   8 +-
 4 files changed, 217 insertions(+), 92 deletions(-)

-- 
2.7.4



Re: [PATCH] at24: support eeproms that do not roll over page reads.

2017-10-31 Thread kbuild test robot
Hi Sven,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.14-rc7]
[cannot apply to next-20171018]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Sven-Van-Asbroeck/at24-support-eeproms-that-do-not-roll-over-page-reads/20171101-114231
config: tile-allyesconfig (attached as .config)
compiler: tilegx-linux-gcc (GCC) 4.6.2
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=tile 

All warnings (new ones prefixed by >>):

   drivers/misc/eeprom/at24.c: In function 'at24_translate_offset':
>> drivers/misc/eeprom/at24.c:210:12: warning: comparison of distinct pointer types lacks a cast [enabled by default]

vim +210 drivers/misc/eeprom/at24.c

   185  
   186  /*
   187   * This routine supports chips which consume multiple I2C addresses. It
   188   * computes the addressing information to be used for a given r/w request.
   189   * Assumes that sanity checks for offset happened at sysfs-layer.
   190   *
   191   * Slave address and byte offset derive from the offset. Always
   192   * set the byte address; on a multi-master board, another master
   193   * may have changed the chip's "current" address pointer.
   194   *
   195   * In case of chips that don't rollover page reads, truncate the count
   196   * to the nearest page boundary. This might result in the
   197   * at24_eeprom_read_XXX functions reading fewer bytes than requested,
   198   * but this is compensated for in at24_read().
   199   */
   200  static struct i2c_client *at24_translate_offset(struct at24_data *at24,
   201  unsigned int *offset, size_t *count)
   202  {
   203  unsigned int i, bits, remainder;
   204  
   205  bits = (at24->chip.flags & AT24_FLAG_ADDR16) ? 16 : 8;
   206  i = *offset >> bits;
   207  *offset &= AT24_BITMASK(bits);
   208  if ((at24->chip.flags & AT24_FLAG_NO_RDROL) && count) {
   209  remainder = BIT(bits) - *offset;
 > 210  *count = min(*count, remainder);
   211  }
   212  
   213  return at24->client[i];
   214  }
   215  
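
The warning above comes from min() being handed a size_t (*count) and an unsigned int (remainder). A minimal userspace sketch of the same truncation with both operands brought to one type is shown below. The toy_* names are hypothetical, the AT24_FLAG_NO_RDROL check is left out for brevity, and in kernel code the usual route would be min_t() or making remainder a size_t; this is not necessarily how the patch was later fixed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smaller of two size_t values. */
static size_t toy_min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Split a linear EEPROM offset into a client index and an in-chip offset,
 * and clamp *count so the transfer stops at the end of the addressable
 * window. addr_bits is 8 or 16, matching the quoted function.
 */
static unsigned int toy_translate_offset(unsigned int addr_bits,
					 unsigned int *offset, size_t *count)
{
	unsigned int index;
	size_t remainder;

	assert(addr_bits == 8 || addr_bits == 16);
	index = *offset >> addr_bits;
	*offset &= (1u << addr_bits) - 1;
	if (count) {
		/* Both operands are size_t, so no mixed-type comparison. */
		remainder = ((size_t)1 << addr_bits) - *offset;
		*count = toy_min_size(*count, remainder);
	}
	return index;
}
```

For example, with 8-bit addressing an offset of 0x1f0 and a count of 64 gives index 1, in-chip offset 0xf0 and a count clamped to 16.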

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH] selftests/ftrace: Introduce exit_pass and exit_fail

2017-10-31 Thread Masami Hiramatsu
On Tue, 31 Oct 2017 17:44:32 -0400
Steven Rostedt  wrote:

> On Tue, 31 Oct 2017 23:51:42 +0900
> Masami Hiramatsu  wrote:
> 
> > diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/busy_check.tc 
> > b/tools/testing/selftests/ftrace/test.d/kprobe/busy_check.tc
> > index 74507db8bbc8..b8701fa0b8f2 100644
> > --- a/tools/testing/selftests/ftrace/test.d/kprobe/busy_check.tc
> > +++ b/tools/testing/selftests/ftrace/test.d/kprobe/busy_check.tc
> > @@ -8,7 +8,7 @@ echo > kprobe_events
> >  echo p:myevent _do_fork > kprobe_events
> >  test -d events/kprobes/myevent
> >  echo 1 > events/kprobes/myevent/enable
> > -echo > kprobe_events && exit 1 # this must fail
> > +echo > kprobe_events && exit_fail
> 
> Should we keep the comment about "this must fail", otherwise it may
> look like a mistake. Echoing in kprobe_events returns failure here?

Ah, good catch! I misread the comment as being for "exit 1"...

Thank you,

> 
> -- Steve
> 
> 
> >  echo 0 > events/kprobes/myevent/enable
> >  echo > kprobe_events # this must succeed
> >  clear_trace


-- 
Masami Hiramatsu 


Re: [PATCH] [irq] Fix boot failure when irqaffinity is passed.

2017-10-31 Thread Rakib Mullick
On Tue, Oct 31, 2017 at 5:29 PM, Ingo Molnar  wrote:
>
>
> Not applied, because this patch causes the following build warning:
>
>   kernel/irq/irqdesc.c:43:6: warning: the address of ‘irq_default_affinity’ will always evaluate as ‘true’ [-Waddress]
>
Ah, sorry I didn't look into the build log. It happened due to removal
of #ifdef's. Now, it's been fixed by using cpumask_available().

> Also, please pick up the improved changelog below for the next version of the
> patch.
>
Thanks for the improved changelog, I have sent a new version here:
https://lkml.org/lkml/2017/11/1/6.

Thanks,
Rakib.


Re: [PATCH] at24: support eeproms that do not roll over page reads.

2017-10-31 Thread kbuild test robot
Hi Sven,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.14-rc7]
[cannot apply to next-20171018]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Sven-Van-Asbroeck/at24-support-eeproms-that-do-not-roll-over-page-reads/20171101-114231
config: sparc64-allyesconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=sparc64 

All warnings (new ones prefixed by >>):

   In file included from drivers/misc/eeprom/at24.c:12:0:
   drivers/misc/eeprom/at24.c: In function 'at24_translate_offset':
   include/linux/kernel.h:790:16: warning: comparison of distinct pointer types lacks a cast
 (void) (&min1 == &min2);   \
   ^
   include/linux/kernel.h:799:2: note: in expansion of macro '__min'
 __min(typeof(x), typeof(y),   \
 ^
>> drivers/misc/eeprom/at24.c:210:12: note: in expansion of macro 'min'
  *count = min(*count, remainder);
   ^~~

vim +/min +210 drivers/misc/eeprom/at24.c

  > 12  #include 
13  #include 
14  #include 
15  #include 
16  #include 
17  #include 
18  #include 
19  #include 
20  #include 
21  #include 
22  #include 
23  #include 
24  #include 
25  #include 
26  #include 
27  
28  /*
29   * I2C EEPROMs from most vendors are inexpensive and mostly interchangeable.
30   * Differences between different vendor product lines (like Atmel AT24C or
31   * MicroChip 24LC, etc) won't much matter for typical read/write access.
32   * There are also I2C RAM chips, likewise interchangeable. One example
33   * would be the PCF8570, which acts like a 24c02 EEPROM (256 bytes).
34   *
35   * However, misconfiguration can lose data. "Set 16-bit memory address"
36   * to a part with 8-bit addressing will overwrite data. Writing with too
37   * big a page size also loses data. And it's not safe to assume that the
38   * conventional addresses 0x50..0x57 only hold eeproms; a PCF8563 RTC
39   * uses 0x51, for just one example.
40   *
41   * Accordingly, explicit board-specific configuration data should be used
42   * in almost all cases. (One partial exception is an SMBus used to access
44   *
45   * So this driver uses "new style" I2C driver binding, expecting to be
46   * told what devices exist. That may be in arch/X/mach-Y/board-Z.c or
47   * similar kernel-resident tables; or, configuration data coming from
48   * a bootloader.
49   *
50   * Other than binding model, current differences from "eeprom" driver are
51   * that this one handles write access and isn't restricted to 24c02 devices.
52   * It also handles larger devices (32 kbit and up) with two-byte addresses,
53   * which won't work on pure SMBus systems.
54   */
55  
56  struct at24_data {
57  struct at24_platform_data chip;
58  int use_smbus;
59  int use_smbus_write;
60  
61  ssize_t (*read_func)(struct at24_data *, char *, unsigned int, size_t);
62  ssize_t (*write_func)(struct at24_data *,
63const char *, unsigned int, size_t);
64  
65  /*
66   * Lock protects against activities from other Linux tasks,
67   * but not from changes by other I2C masters.
68   */
69  struct mutex lock;
70  
71  u8 *writebuf;
72  unsigned write_max;
73  unsigned num_addresses;
74  
75  struct nvmem_config nvmem_config;
76  struct nvmem_device *nvmem;
77  
78  /*
79   * Some chips tie up multiple I2C addresses; dummy devices reserve
80   * them for us, and we'll use them with SMBus calls.
81   */
82  struct i2c_client *client[];
83  };
84  
85  /*
86   * This parameter is to help this driver avoid blocking other drivers out
87   * of I2C for potentially troublesome amounts of time. With a 100 kHz I2C
88   * clock, one 256 byte read takes about 1/43 second which is excessive;
89   * but the 1/170 second it takes at 400 kHz may be quite reasonable; and
90   * at 1 MHz (Fm+) a 1/430 second delay could easily be invisible.
91   *
92   * This value is forced to be a power of two so that writes align on pages.
93   */
94  static unsigned io_limit = 128;
95  module_param(io_limit, uint, 0);
96  M

Re: linux-next: manual merge of the net-next tree with the net tree

2017-10-31 Thread Cong Wang
On Tue, Oct 31, 2017 at 5:58 PM, Stephen Rothwell  wrote:
> Hi all,
>
> Today's linux-next merge of the net-next tree got a conflict in:
>
>   net/sched/cls_api.c
>
> between commit:
>
>   822e86d997e4 ("net_sched: remove tcf_block_put_deferred()")
>
> from the net tree and commit:
>
>   8c4083b30e56 ("net: sched: add block bind/unbind notif. and extended block_get/put")
>
> from the net-next tree.

Seems good.

Thanks!


Re: [RFC/RFT PATCH 3/6] ACPI / APEI: Replace ioremap_page_range() with fixmap

2017-10-31 Thread gengdongjiu
On 2017/10/31 23:38, James Morse wrote:
> CC'd people I've seen posting CPER log fragments, could you give this a
> test on your platforms?
Thanks for the fix; I did not find any obvious issues.



[PATCH] irq/core: Fix boot crash when the irqaffinity= boot parameter is passed on CPUMASK_OFFSTACK=y kernels(v1)

2017-10-31 Thread Rakib Mullick
When the irqaffinity= kernel parameter is passed in a CPUMASK_OFFSTACK=y
kernel, it fails to boot, because zalloc_cpumask_var() cannot be used before
initializing the slab allocator to allocate a cpumask.

So, use alloc_bootmem_cpumask_var() instead.

Also do some cleanups while at it: in init_irq_default_affinity() remove
an unnecessary #ifdef.

Change since v0:
* Fix build warning.

Signed-off-by: Rakib Mullick 
Cc: Thomas Gleixner 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20171026045800.27087-1-rakib.mull...@gmail.com
Signed-off-by: Ingo Molnar 
---
Patch created against -rc7 (commit 0b07194bb55ed836c2). I found tip had a merge
conflict, so used -rc7 instead.

 kernel/irq/irqdesc.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 82afb7e..e97bbae 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -27,7 +27,7 @@ static struct lock_class_key irq_desc_lock_class;
 #if defined(CONFIG_SMP)
 static int __init irq_affinity_setup(char *str)
 {
-   zalloc_cpumask_var(&irq_default_affinity, GFP_NOWAIT);
+   alloc_bootmem_cpumask_var(&irq_default_affinity);
cpulist_parse(str, irq_default_affinity);
/*
 * Set at least the boot cpu. We don't want to end up with
@@ -40,10 +40,8 @@ __setup("irqaffinity=", irq_affinity_setup);
 
 static void __init init_irq_default_affinity(void)
 {
-#ifdef CONFIG_CPUMASK_OFFSTACK
-   if (!irq_default_affinity)
+   if (!cpumask_available(irq_default_affinity))
zalloc_cpumask_var(&irq_default_affinity, GFP_NOWAIT);
-#endif
if (cpumask_empty(irq_default_affinity))
cpumask_setall(irq_default_affinity);
 }
-- 
2.9.3



linux-next: build warning after merge of the sound-asoc tree

2017-10-31 Thread Stephen Rothwell
Hi all,

After merging the sound-asoc tree, today's linux-next build (x86_64
allmodconfig) produced this warning:

sound/soc/stm/stm32_sai_sub.c: In function 'stm32_sai_hw_params':
sound/soc/stm/stm32_sai_sub.c:485:7: warning: 'cr1' may be used uninitialized in this function [-Wmaybe-uninitialized]
   cr1 |= SAI_XCR1_DS_SET(SAI_DATASIZE_8);
   ^
sound/soc/stm/stm32_sai_sub.c:469:6: note: 'cr1' was declared here
  int cr1, cr1_mask, ret;
  ^

Introduced by commit

  61fb4ff70377 ("ASoC: stm32: sai: Move static settings to DAI init")

-- 
Cheers,
Stephen Rothwell


[PATCH v3 1/2] gpio: gpiolib: Generalise state persistence beyond sleep

2017-10-31 Thread Andrew Jeffery
General support for state persistence is added to gpiolib with the
introduction of a new pinconf parameter to propagate the request to
hardware. The existing persistence support for sleep is adapted to
include hardware support if the GPIO driver provides it. Persistence
continues to be enabled by default; in-kernel consumers can opt out, but
userspace (currently) does not have a choice.

The *_SLEEP_MAY_LOSE_VALUE and *_SLEEP_MAINTAIN_VALUE symbols are
renamed, dropping the SLEEP prefix to reflect that the concept is no
longer sleep-specific.  I feel that renaming to just *_MAY_LOSE_VALUE
could initially be misinterpreted, so I've further changed the symbols
to *_TRANSITORY and *_PERSISTENT to address this.

The sysfs interface is modified only to keep consistency with the
chardev interface in enforcing persistence for userspace exports.

Signed-off-by: Andrew Jeffery 
---
 drivers/gpio/gpiolib-of.c   |  6 ++--
 drivers/gpio/gpiolib-sysfs.c| 14 +---
 drivers/gpio/gpiolib.c  | 61 ++---
 drivers/gpio/gpiolib.h  |  2 +-
 include/dt-bindings/gpio/gpio.h |  6 ++--
 include/linux/gpio/consumer.h   |  8 +
 include/linux/gpio/machine.h|  4 +--
 include/linux/of_gpio.h |  2 +-
 include/linux/pinctrl/pinconf-generic.h |  2 ++
 9 files changed, 87 insertions(+), 18 deletions(-)

diff --git a/drivers/gpio/gpiolib-of.c b/drivers/gpio/gpiolib-of.c
index e0d59e61b52f..4a2b8d3397c7 100644
--- a/drivers/gpio/gpiolib-of.c
+++ b/drivers/gpio/gpiolib-of.c
@@ -153,8 +153,8 @@ struct gpio_desc *of_find_gpio(struct device *dev, const char *con_id,
*flags |= GPIO_OPEN_SOURCE;
}
 
-   if (of_flags & OF_GPIO_SLEEP_MAY_LOSE_VALUE)
-   *flags |= GPIO_SLEEP_MAY_LOSE_VALUE;
+   if (of_flags & OF_GPIO_TRANSITORY)
+   *flags |= GPIO_TRANSITORY;
 
return desc;
 }
@@ -214,6 +214,8 @@ static struct gpio_desc *of_parse_own_gpio(struct device_node *np,
 
if (xlate_flags & OF_GPIO_ACTIVE_LOW)
*lflags |= GPIO_ACTIVE_LOW;
+   if (xlate_flags & OF_GPIO_TRANSITORY)
+   *lflags |= GPIO_TRANSITORY;
 
if (of_property_read_bool(np, "input"))
*dflags |= GPIOD_IN;
diff --git a/drivers/gpio/gpiolib-sysfs.c b/drivers/gpio/gpiolib-sysfs.c
index 3f454eaf2101..0bd472ffb072 100644
--- a/drivers/gpio/gpiolib-sysfs.c
+++ b/drivers/gpio/gpiolib-sysfs.c
@@ -474,11 +474,15 @@ static ssize_t export_store(struct class *class,
status = -ENODEV;
goto done;
}
-   status = gpiod_export(desc, true);
-   if (status < 0)
-   gpiod_free(desc);
-   else
-   set_bit(FLAG_SYSFS, &desc->flags);
+
+   status = gpiod_set_transitory(desc, false);
+   if (!status) {
+   status = gpiod_export(desc, true);
+   if (status < 0)
+   gpiod_free(desc);
+   else
+   set_bit(FLAG_SYSFS, &desc->flags);
+   }
 
 done:
if (status)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 3827f0767101..a5e81dc03aba 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -503,6 +503,10 @@ static int linehandle_create(struct gpio_device *gdev, void __user *ip)
if (lflags & GPIOHANDLE_REQUEST_OPEN_SOURCE)
set_bit(FLAG_OPEN_SOURCE, &desc->flags);
 
+   ret = gpiod_set_transitory(desc, false);
+   if (ret < 0)
+   goto out_free_descs;
+
/*
 * Lines have to be requested explicitly for input
 * or output, else the line will be treated "as is".
@@ -2424,6 +2428,49 @@ int gpiod_set_debounce(struct gpio_desc *desc, unsigned debounce)
 EXPORT_SYMBOL_GPL(gpiod_set_debounce);
 
 /**
+ * gpiod_set_transitory - Lose or retain GPIO state on suspend or reset
+ * @desc: descriptor of the GPIO for which to configure persistence
+ * @transitory: True to lose state on suspend or reset, false for persistence
+ *
+ * Returns:
+ * 0 on success, otherwise a negative error code.
+ */
+int gpiod_set_transitory(struct gpio_desc *desc, bool transitory)
+{
+   struct gpio_chip *chip;
+   unsigned long packed;
+   int gpio;
+   int rc;
+
+   /*
+* Handle FLAG_TRANSITORY first, enabling queries to gpiolib for
+* persistence state.
+*/
+   if (transitory)
+   set_bit(FLAG_TRANSITORY, &desc->flags);
+   else
+   clear_bit(FLAG_TRANSITORY, &desc->flags);
+
+   /* If the driver supports it, set the persistence state now */
+   chip = desc->gdev->chip;
+   if (!chip->set_config)
+   return 0;
+
+   packed = pinconf_to_config_packed(PIN_CONFIG_PERSIST_STATE,
+ !transitory);
+   gpio = gpio

[PATCH v3 2/2] gpio: aspeed: Add support for reset tolerance

2017-10-31 Thread Andrew Jeffery
Use the new pinconf parameter for state persistence to expose the
associated capability of the Aspeed GPIO controller.

Signed-off-by: Andrew Jeffery 
---
 drivers/gpio/gpio-aspeed.c | 39 +--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/drivers/gpio/gpio-aspeed.c b/drivers/gpio/gpio-aspeed.c
index 00dc1c020198..3125dcb9211d 100644
--- a/drivers/gpio/gpio-aspeed.c
+++ b/drivers/gpio/gpio-aspeed.c
@@ -60,6 +60,7 @@ struct aspeed_gpio_bank {
uint16_tval_regs;
uint16_tirq_regs;
uint16_tdebounce_regs;
+   uint16_ttolerance_regs;
const char  names[4][3];
 };
 
@@ -70,48 +71,56 @@ static const struct aspeed_gpio_bank aspeed_gpio_banks[] = {
.val_regs = 0x,
.irq_regs = 0x0008,
.debounce_regs = 0x0040,
+   .tolerance_regs = 0x001c,
.names = { "A", "B", "C", "D" },
},
{
.val_regs = 0x0020,
.irq_regs = 0x0028,
.debounce_regs = 0x0048,
+   .tolerance_regs = 0x003c,
.names = { "E", "F", "G", "H" },
},
{
.val_regs = 0x0070,
.irq_regs = 0x0098,
.debounce_regs = 0x00b0,
+   .tolerance_regs = 0x00ac,
.names = { "I", "J", "K", "L" },
},
{
.val_regs = 0x0078,
.irq_regs = 0x00e8,
.debounce_regs = 0x0100,
+   .tolerance_regs = 0x00fc,
.names = { "M", "N", "O", "P" },
},
{
.val_regs = 0x0080,
.irq_regs = 0x0118,
.debounce_regs = 0x0130,
+   .tolerance_regs = 0x012c,
.names = { "Q", "R", "S", "T" },
},
{
.val_regs = 0x0088,
.irq_regs = 0x0148,
.debounce_regs = 0x0160,
+   .tolerance_regs = 0x015c,
.names = { "U", "V", "W", "X" },
},
{
.val_regs = 0x01E0,
.irq_regs = 0x0178,
.debounce_regs = 0x0190,
+   .tolerance_regs = 0x018c,
.names = { "Y", "Z", "AA", "AB" },
},
{
-   .val_regs = 0x01E8,
-   .irq_regs = 0x01A8,
+   .val_regs = 0x01e8,
+   .irq_regs = 0x01a8,
.debounce_regs = 0x01c0,
+   .tolerance_regs = 0x01bc,
.names = { "AC", "", "", "" },
},
 };
@@ -534,6 +543,30 @@ static int aspeed_gpio_setup_irqs(struct aspeed_gpio *gpio,
return 0;
 }
 
+static int aspeed_gpio_reset_tolerance(struct gpio_chip *chip,
+   unsigned int offset, bool enable)
+{
+   struct aspeed_gpio *gpio = gpiochip_get_data(chip);
+   const struct aspeed_gpio_bank *bank;
+   unsigned long flags;
+   u32 val;
+
+   bank = to_bank(offset);
+
+   spin_lock_irqsave(&gpio->lock, flags);
+   val = readl(gpio->base + bank->tolerance_regs);
+
+   if (enable)
+   val |= GPIO_BIT(offset);
+   else
+   val &= ~GPIO_BIT(offset);
+
+   writel(val, gpio->base + bank->tolerance_regs);
+   spin_unlock_irqrestore(&gpio->lock, flags);
+
+   return 0;
+}
+
 static int aspeed_gpio_request(struct gpio_chip *chip, unsigned int offset)
 {
if (!have_gpio(gpiochip_get_data(chip), offset))
@@ -771,6 +804,8 @@ static int aspeed_gpio_set_config(struct gpio_chip *chip, unsigned int offset,
param == PIN_CONFIG_DRIVE_OPEN_SOURCE)
/* Return -ENOTSUPP to trigger emulation, as per datasheet */
return -ENOTSUPP;
+   else if (param == PIN_CONFIG_PERSIST_STATE)
+   return aspeed_gpio_reset_tolerance(chip, offset, arg);
 
return -ENOTSUPP;
 }
-- 
2.11.0
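
The helper added by this patch is a locked read-modify-write of a single bit
in the bank's tolerance register. A minimal user-space sketch of just the bit
manipulation is below. The register is modelled as a plain value rather than
MMIO, update_tolerance() and bit_for_offset() are hypothetical names, and the
"offset & 0x1f" mapping assumes one 32-bit register per 32 lines, which is not
shown in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one 32-bit register per bank, line N uses bit (N % 32). */
static uint32_t bit_for_offset(unsigned int offset)
{
	return UINT32_C(1) << (offset & 0x1f);
}

/*
 * Hypothetical stand-in for aspeed_gpio_reset_tolerance(): returns the new
 * register value instead of doing readl()/writel() under a spinlock.
 */
static uint32_t update_tolerance(uint32_t reg, unsigned int offset, bool enable)
{
	if (enable)
		reg |= bit_for_offset(offset);	/* keep state across reset */
	else
		reg &= ~bit_for_offset(offset);	/* let reset clear the line */
	return reg;
}
```

In the driver the same update is bracketed by spin_lock_irqsave() and
spin_unlock_irqrestore() so that concurrent updates to other lines in the
same register are not lost.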



[PATCH v3 0/2] gpio: Generalise state persistence

2017-10-31 Thread Andrew Jeffery
Hello,

This series provides an API to configure general GPIO state persistence in
gpiolib. Previously, only sleep persistence was considered, but controllers
like one found in Aspeed BMCs also support persistence of state across
controller resets. There is some prior discussion on v1[1] and the initial
RFC[2], and minor comments on v2[3]. v3 addresses minor issues with comments
and debug statements[4], removing remaining references to reset tolerance.

Please review!

Andrew

[1] https://www.spinics.net/lists/devicetree/msg200027.html
[2] https://www.spinics.net/lists/devicetree/msg199559.html
[3] https://www.spinics.net/lists/kernel/msg2635769.html
[4] https://www.spinics.net/lists/devicetree/msg200040.html

Andrew Jeffery (2):
  gpio: gpiolib: Generalise state persistence beyond sleep
  gpio: aspeed: Add support for reset tolerance

 drivers/gpio/gpio-aspeed.c  | 39 +++--
 drivers/gpio/gpiolib-of.c   |  6 ++--
 drivers/gpio/gpiolib-sysfs.c| 14 +---
 drivers/gpio/gpiolib.c  | 61 ++---
 drivers/gpio/gpiolib.h  |  2 +-
 include/dt-bindings/gpio/gpio.h |  6 ++--
 include/linux/gpio/consumer.h   |  8 +
 include/linux/gpio/machine.h|  4 +--
 include/linux/of_gpio.h |  2 +-
 include/linux/pinctrl/pinconf-generic.h |  2 ++
 10 files changed, 124 insertions(+), 20 deletions(-)

-- 
2.11.0



linux-next: manual merge of the sound-asoc tree with the drm tree

2017-10-31 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the sound-asoc tree got a conflict in:

  drivers/gpu/drm/amd/include/amd_shared.h

between commit:

  cfa289fd4986 ("drm/amdgpu: rename amdgpu_dpm_funcs to amd_pm_funcs")

from the drm tree and commit:

  f674bd281460 ("drm/amdgpu Moving amdgpu asic types to a separate file")

from the sound-asoc tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/gpu/drm/amd/include/amd_shared.h
index de6fc2731b98,3a49fbd8baf8..
--- a/drivers/gpu/drm/amd/include/amd_shared.h
+++ b/drivers/gpu/drm/amd/include/amd_shared.h
@@@ -23,36 -23,9 +23,11 @@@
  #ifndef __AMD_SHARED_H__
  #define __AMD_SHARED_H__
  
- #define AMD_MAX_USEC_TIMEOUT  20  /* 200 ms */
+ #include 
  
 +struct seq_file;
 +
- /*
-  * Supported ASIC types
-  */
- enum amd_asic_type {
-   CHIP_TAHITI = 0,
-   CHIP_PITCAIRN,
-   CHIP_VERDE,
-   CHIP_OLAND,
-   CHIP_HAINAN,
-   CHIP_BONAIRE,
-   CHIP_KAVERI,
-   CHIP_KABINI,
-   CHIP_HAWAII,
-   CHIP_MULLINS,
-   CHIP_TOPAZ,
-   CHIP_TONGA,
-   CHIP_FIJI,
-   CHIP_CARRIZO,
-   CHIP_STONEY,
-   CHIP_POLARIS10,
-   CHIP_POLARIS11,
-   CHIP_POLARIS12,
-   CHIP_VEGA10,
-   CHIP_RAVEN,
-   CHIP_LAST,
- };
+ #define AMD_MAX_USEC_TIMEOUT  20  /* 200 ms */
  
  /*
   * Chip flags


[PATCH 2/2] Documentation: fsl: dspi: Add a compatible string for ls1088a DSPI

2017-10-31 Thread Zhiqiang Hou
From: Hou Zhiqiang 

Add a new compatible string "fsl,ls1088a-dspi".

Signed-off-by: Hou Zhiqiang 
---
 Documentation/devicetree/bindings/spi/spi-fsl-dspi.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/spi/spi-fsl-dspi.txt b/Documentation/devicetree/bindings/spi/spi-fsl-dspi.txt
index dcc7eaada511..5fc467211cc6 100644
--- a/Documentation/devicetree/bindings/spi/spi-fsl-dspi.txt
+++ b/Documentation/devicetree/bindings/spi/spi-fsl-dspi.txt
@@ -6,6 +6,7 @@ Required properties:
or
"fsl,ls2080a-dspi" followed by "fsl,ls2085a-dspi"
"fsl,ls1012a-dspi" followed by "fsl,ls1021a-v1.0-dspi"
+   "fsl,ls1088a-dspi" followed by "fsl,ls2085a-dspi"
 - reg : Offset and length of the register set for the device
 - interrupts : Should contain SPI controller interrupt
 - clocks: from common clock binding: handle to dspi clock.
-- 
2.14.1



[PATCH 1/2] arm64: dts: ls1088a: add DT nodes for DSPI support

2017-10-31 Thread Zhiqiang Hou
From: Hou Zhiqiang 

Signed-off-by: Hou Zhiqiang 
---
 arch/arm64/boot/dts/freescale/fsl-ls1088a-qds.dts | 28 +++
 arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi| 13 +++
 2 files changed, 41 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a-qds.dts b/arch/arm64/boot/dts/freescale/fsl-ls1088a-qds.dts
index 30128051d0c0..cf5b85b93ae6 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1088a-qds.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a-qds.dts
@@ -134,6 +134,34 @@
};
 };
 
+&dspi {
+   status = "okay";
+
+   dflash0: n25q128a {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "n25q128a11", "jedec,spi-nor";
+   reg = <0>;
+   spi-max-frequency = <300>;
+   };
+
+   dflash1: sst25wf040b {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "sst,sst25wf040b", "jedec,spi-nor";
+   reg = <1>;
+   spi-max-frequency = <300>;
+   };
+
+   dflash2: en25s64 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "eon,en25s64", "jedec,spi-nor";
+   reg = <2>;
+   spi-max-frequency = <300>;
+   };
+};
+
 &duart0 {
status = "okay";
 };
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
index bd80e9a2e67c..f5ed3878abb7 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
@@ -276,6 +276,19 @@
};
};
 
+   dspi: dspi@210 {
+   compatible = "fsl,ls1088a-dspi", "fsl,ls2085a-dspi";
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <0x0 0x210 0x0 0x1>;
+   interrupts = <0 26 0x4>; /* Level high type */
+   clocks = <&clockgen 4 3>;
+   clock-names = "dspi";
+   spi-num-chipselects = <5>;
+   bus-num = <0>;
+   status = "disabled";
+   };
+
duart0: serial@21c0500 {
compatible = "fsl,ns16550", "ns16550a";
reg = <0x0 0x21c0500 0x0 0x100>;
-- 
2.14.1



[PATCH 0/2] arm64: dts: Add ls1088a DSPI device tree nodes

2017-10-31 Thread Zhiqiang Hou
From: Hou Zhiqiang 

LS1088A reuses the LS2085A DSPI driver, so this patchset just adds the
device tree nodes and a compatible entry to the documentation.

Hou Zhiqiang (2):
  arm64: dts: ls1088a: add DT nodes for DSPI support
  Documentation: fsl: dspi: Add a compatible string for ls1088a DSPI

 .../devicetree/bindings/spi/spi-fsl-dspi.txt   |  1 +
 arch/arm64/boot/dts/freescale/fsl-ls1088a-qds.dts  | 28 ++
 arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 13 ++
 3 files changed, 42 insertions(+)

-- 
2.14.1



linux-next: build warning after merge of the drm-msm tree

2017-10-31 Thread Stephen Rothwell
Hi Rob,

After merging the drm-msm tree, today's linux-next build (arm
multi_v7_defconfig) produced this warning:

In file included from include/drm/drm_mm.h:49:0,
 from include/drm/drmP.h:73,
 from drivers/gpu/drm/msm/msm_drv.h:37,
 from drivers/gpu/drm/msm/msm_gpu.h:24,
 from drivers/gpu/drm/msm/msm_gpu.c:18:
drivers/gpu/drm/msm/msm_gpu.c: In function 'msm_gpu_init':
drivers/gpu/drm/msm/msm_gpu.c:780:31: warning: format '%lu' expects argument of type 'long unsigned int', but argument 7 has type 'unsigned int' [-Wformat=]
   DRM_DEV_INFO_ONCE(drm->dev, "Only creating %lu ringbuffers\n",
   ^
include/drm/drm_print.h:237:60: note: in definition of macro 'DRM_DEV_INFO'
  drm_dev_printk(dev, KERN_INFO, DRM_UT_NONE, __func__, "", fmt, \
^
drivers/gpu/drm/msm/msm_gpu.c:780:3: note: in expansion of macro 'DRM_DEV_INFO_ONCE'
   DRM_DEV_INFO_ONCE(drm->dev, "Only creating %lu ringbuffers\n",
   ^

Introduced by commit

  f97decac5f4c ("drm/msm: Support multiple ringbuffers")

-- 
Cheers,
Stephen Rothwell


linux-next: manual merge of the drm-misc tree with the drm tree

2017-10-31 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the drm-misc tree got a conflict in:

  include/drm/drmP.h

between commit:

  e7646f84ad4f ("drm: Add new LEASE debug level")

from the drm tree and commit:

  02c9656b2f0d ("drm: Move debug macros out of drmP.h")

from the drm-misc tree.

I fixed it up (I used the drm-misc version of the file and added the below
merge fix patch) and can carry the fix as necessary. This is now fixed
as far as linux-next is concerned, but any non trivial conflicts should
be mentioned to your upstream maintainer when your tree is submitted for
merging.  You may also want to consider cooperating with the maintainer
of the conflicting tree to minimise any particularly complex conflicts.

From: Stephen Rothwell 
Date: Wed, 1 Nov 2017 14:33:07 +1100
Subject: [PATCH] drm-misc: merge fix up for DEBUG printing macros move

Signed-off-by: Stephen Rothwell 
---
 include/drm/drm_print.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index 7b9c86a6ca3e..edcea83a5050 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -171,6 +171,7 @@ static inline struct drm_printer drm_debug_printer(const char *prefix)
 #define DRM_UT_ATOMIC  0x10
 #define DRM_UT_VBL 0x20
 #define DRM_UT_STATE   0x40
+#define DRM_UT_LEASE   0x80
 
 __printf(6, 7)
 void drm_dev_printk(const struct device *dev, const char *level,
@@ -287,6 +288,9 @@ void drm_printk(const char *level, unsigned int category,
 #define DRM_DEBUG_VBL(fmt, ...)\
drm_printk(KERN_DEBUG, DRM_UT_VBL, fmt, ##__VA_ARGS__)
 
+#define DRM_DEBUG_LEASE(fmt, ...)  \
+   drm_printk(KERN_DEBUG, DRM_UT_LEASE, fmt, ##__VA_ARGS__)
+
 #define _DRM_DEV_DEFINE_DEBUG_RATELIMITED(dev, level, fmt, args...)\
 ({ \
static DEFINE_RATELIMIT_STATE(_rs,  \


-- 
Cheers,
Stephen Rothwell


[PATCH v3] tracing: Allocate mask_str buffer dynamically

2017-10-31 Thread changbin . du
From: Changbin Du 

NR_CPUS can be very large, while the actual nr_cpu_ids is usually very
small. On my x86 distribution, NR_CPUS is 8192 and nr_cpu_ids is 4, so
about 2 pages are wasted.

Most machines don't have that many CPUs, so defining an array of NR_CPUS
bytes just wastes memory. Allocate the buffer dynamically when needed
instead.

The exact buffer size should be:
  DIV_ROUND_UP(nr_cpu_ids, 4) + nr_cpu_ids/32 + 2;

Example output:
  ff,

With this change, the mutex tracing_cpumask_update_lock, which was used
to protect mask_str, can also be removed.

Signed-off-by: Changbin Du 
Cc: Steven Rostedt 

---
v3:
  - remove tracing_cpumask_update_lock which was used to protect mask_str. (Rostedt)
v2:
  - remove 'static' declaration.
  - fix buffer size.
---
 kernel/trace/trace.c | 29 +
 1 file changed, 9 insertions(+), 20 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 752e5da..5d2ec80 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4178,37 +4178,30 @@ static const struct file_operations show_traces_fops = {
.llseek = seq_lseek,
 };
 
-/*
- * The tracer itself will not take this lock, but still we want
- * to provide a consistent cpumask to user-space:
- */
-static DEFINE_MUTEX(tracing_cpumask_update_lock);
-
-/*
- * Temporary storage for the character representation of the
- * CPU bitmask (and one more byte for the newline):
- */
-static char mask_str[NR_CPUS + 1];
-
 static ssize_t
 tracing_cpumask_read(struct file *filp, char __user *ubuf,
 size_t count, loff_t *ppos)
 {
struct trace_array *tr = file_inode(filp)->i_private;
+   char *mask_str;
int len;
 
-   mutex_lock(&tracing_cpumask_update_lock);
+   /* Bitmap, ',' and two more bytes for the newline and '\0'. */
+   len = DIV_ROUND_UP(nr_cpu_ids, 4) + nr_cpu_ids/32 + 2;
+   mask_str = kmalloc(len, GFP_KERNEL);
+   if (!mask_str)
+   return -ENOMEM;
 
-   len = snprintf(mask_str, count, "%*pb\n",
+   len = snprintf(mask_str, len, "%*pb\n",
   cpumask_pr_args(tr->tracing_cpumask));
if (len >= count) {
count = -EINVAL;
goto out_err;
}
-   count = simple_read_from_buffer(ubuf, count, ppos, mask_str, NR_CPUS+1);
+   count = simple_read_from_buffer(ubuf, count, ppos, mask_str, len);
 
 out_err:
-   mutex_unlock(&tracing_cpumask_update_lock);
+   kfree(mask_str);
 
return count;
 }
@@ -4228,8 +4221,6 @@ tracing_cpumask_write(struct file *filp, const char __user *ubuf,
if (err)
goto err_unlock;
 
-   mutex_lock(&tracing_cpumask_update_lock);
-
local_irq_disable();
arch_spin_lock(&tr->max_lock);
for_each_tracing_cpu(cpu) {
@@ -4252,8 +4243,6 @@ tracing_cpumask_write(struct file *filp, const char __user *ubuf,
local_irq_enable();
 
cpumask_copy(tr->tracing_cpumask, tracing_cpumask_new);
-
-   mutex_unlock(&tracing_cpumask_update_lock);
free_cpumask_var(tracing_cpumask_new);
 
return count;
-- 
2.7.4
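
The buffer-size expression from the commit message can be checked with a few
lines of user-space C. This is only a sketch of the arithmetic:
mask_str_len() is a hypothetical helper name, and DIV_ROUND_UP is redefined
locally to mirror the kernel macro.

```c
#include <stddef.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Buffer size used by the patch: one hex digit per four CPUs, one ','
 * per 32 CPUs, plus two bytes for the trailing newline and '\0'.
 */
static size_t mask_str_len(unsigned int nr_cpu_ids)
{
	return DIV_ROUND_UP(nr_cpu_ids, 4) + nr_cpu_ids / 32 + 2;
}
```

For the nr_cpu_ids = 4 case in the message this gives 3 bytes, against the
8193 bytes (NR_CPUS + 1) of the old static array; even with all 8192 CPUs
present it gives 2306 bytes.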



Re: [PATCH] atm: iphase: Fix space before '[' error.

2017-10-31 Thread David Miller
From: Arvind Yadav 
Date: Mon, 30 Oct 2017 21:22:03 +0530

> Fix checkpatch.pl error:
> ERROR: space prohibited before open square bracket '['.
> 
> Signed-off-by: Arvind Yadav 

Applied.


Re: [PATCH 2/3] thermal: int340x: processor_thermal: Add Coffee Lake support

2017-10-31 Thread Zhang Rui
On Thu, 2017-10-19 at 14:51 -0700, Srinivas Pandruvada wrote:
> Add new PCI id for Coffee lake processor thermal device.
> 
> Signed-off-by: Srinivas Pandruvada
> ---
>  drivers/thermal/int340x_thermal/processor_thermal_device.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git
> a/drivers/thermal/int340x_thermal/processor_thermal_device.c
> b/drivers/thermal/int340x_thermal/processor_thermal_device.c
> index e724a23..1d9f524 100644
> --- a/drivers/thermal/int340x_thermal/processor_thermal_device.c
> +++ b/drivers/thermal/int340x_thermal/processor_thermal_device.c
> @@ -32,6 +32,7 @@
>  
>  /* CannonLake thermal reporting device */
>  #define PCI_DEVICE_ID_PROC_CNL_THERMAL   0x5a03
> +#define PCI_DEVICE_ID_PROC_CFL_THERMAL   0x3E83
>  
shouldn't it be added into proc_thermal_pci_ids[]?

thanks,
rui
>  /* Braswell thermal reporting device */
>  #define PCI_DEVICE_ID_PROC_BSW_THERMAL   0x22DC


Re: [PATCH net-next 0/6] net: ppv2: various improvements

2017-10-31 Thread David Miller
From: Antoine Tenart 
Date: Mon, 30 Oct 2017 11:23:27 +0100

> This series includes various patches improving the Marvell PPv2 driver.
> I send them as a series to avoid any possible merge conflict.
> 
> - Patches 1 and 2 improve the initializing of the Tx and Rx FIFO.
> - Patch 3 initializes the RSS table to evenly distribute the ingress
>   packets across multiple Rx queues based on their hashes.
> - Patch 4 limits the number of TSO segments sent to the driver, to avoid
>   having more segments to handle than the corresponding number of
>   available descriptors.
> - Patch 5 and 6 are cosmetic improvements.
> 
> This applies on today's net-next branch. The patches were tested
> extensively (I ran iperf and http downloads in parallel, transferring
> TBs of data).

Series applied, thanks.


Re: Kernel crash in free_pipe_info()

2017-10-31 Thread Cong Wang
On Mon, Oct 30, 2017 at 7:08 PM, Linus Torvalds
 wrote:
> On Mon, Oct 30, 2017 at 6:19 PM, Cong Wang  wrote:
>>
>> 1. The faulty addresses are all near 0001, with one exception
>> of null (which is the most recent one)
>
> Well, they're at 8(%rax), except for that last case.
>
> And in every case (_including_ that last case), %rax has a very
> interesting pattern.. That's the (bad) buf->ops pointer that  was
> loaded from the somehow corrupted "buf".
>
> The values in all cases are
>
> fffa
> fffd
> fff1
> fff7
> fff4
> fffa
> fffd
> fffd
> fffa
> ffe8
> fff1
> fff7
>
> which kind of looks like a 32-bit error value. So we have (n, val, (errno)):
>
>   1 -24 (EMFILE)
>   2 -15 (ENOTBLK)
>   1 -12 (ENOMEM)
>   2 -9 (EBADF)
>   3 -6 (ENXIO)
>   3 -3 (ESRCH)
>
> none of which makes any sense to me, but it's an interesting pattern
> nonetheless.


Yeah, good find!


>
>> 2. R12 register, which should map to the local variable 'i', is always 0x8
>> at the time of crash.
>
> So _if_ this is some kind of use-after-free thing, and the allocation
> got re-used for something else, that might just be related to whatever
> ends up being the offset that is filled in with the (int) error
> number.
>
> Except the offset is that %r12*0x28+0x10, so we're talking a byte
> offset of 330 bytes into the allocation, and apparently the eight
> previous (0-7) iterations were fine.
>
> Which is really odd.
>
> I'm not seeing anything that makes sense. I'll have to think about this.
>
> I'm assuming you don't have slub debugging enabled, and no way to
> enable it and try to catch this?

We enable it at compile-time but not at run-time:

CONFIG_SLUB_DEBUG=y
CONFIG_SLUB=y
CONFIG_SLUB_CPU_PARTIAL=y
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set

I can try to manually add slub_debug to the boot parameters, but I still
have no idea how or when this bug can be triggered again.


Thanks!
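
The errno reading of the %rax values quoted above can be reproduced with a
small user-space sketch. The hex values appear truncated in this archive, so
the full 32-bit patterns used in the usage note (for example 0xfffffffa) are
an assumption; they do match the counts and error numbers in the quoted
table. as_errno() is a hypothetical helper name.

```c
#include <stdint.h>

/*
 * Reinterpret the low 32 bits of a register value as a signed 32-bit
 * integer, which is how a negative errno stored in an 'int' would look
 * when the surrounding memory is read back as a pointer.
 */
static int32_t as_errno(uint64_t reg)
{
	uint32_t low = (uint32_t)reg;

	/* Two's complement decode without relying on implementation-defined casts. */
	if (low >= UINT32_C(0x80000000))
		return -(int32_t)(~low) - 1;
	return (int32_t)low;
}
```

Under that assumption, as_errno(0xfffffffa) is -6 (ENXIO), as_errno(0xfffffffd)
is -3 (ESRCH) and as_errno(0xffffffe8) is -24 (EMFILE).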


Re: [PATCH] x86, build: Improve the isolinux searching of isoimage generation

2017-10-31 Thread Masahiro Yamada
2017-10-31 18:39 GMT+09:00 Ingo Molnar :
>
> * changbin...@intel.com  wrote:
>
>> From: Changbin Du 
>>
>> Recently I failed to build isoimage target, because the path of isolinux.bin
>> changed to /usr/xxx/ISOLINUX/isolinux.bin, as well as ldlinux.c32 which
>> changed to /usr/xxx/syslinux/modules/bios/ldlinux.c32.
>>
>> This patch improves the file search:
>>   - Don't print the raw shell commands. It doesn't make sense to show the
>> entire big block.
>>   - Show an error message instead of failing silently.
>>   - Add the new paths above.
>>
>> Now it becomes:
>> Kernel: arch/x86/boot/bzImage is ready  (#62)
>> rm -rf arch/x86/boot/isoimage
>> mkdir arch/x86/boot/isoimage
>> Using /usr/lib/ISOLINUX/isolinux.bin
>> Using /usr/lib/syslinux/modules/bios/ldlinux.c32
>> cp arch/x86/boot/bzImage arch/x86/boot/isoimage/linux
>> ...
>>
>> Before:
>> Kernel: arch/x86/boot/bzImage is ready  (#63)
>> rm -rf arch/x86/boot/isoimage
>> mkdir arch/x86/boot/isoimage
>> for i in lib lib64 share end ; do \
>>   if [ -f /usr/$i/syslinux/isolinux.bin ] ; then \
>>   cp /usr/$i/syslinux/isolinux.bin arch/x86/boot/isoimage ; \
>>   if [ -f /usr/$i/syslinux/ldlinux.c32 ]; then \
>>   cp /usr/$i/syslinux/ldlinux.c32 arch/x86/boot/isoimage ; \
>>   fi ; \
>>   break ; \
>>   fi ; \
>>   if [ $i = end ] ; then exit 1 ; fi ; \
>> done
>> arch/x86/boot/Makefile:161: recipe for target 'isoimage' failed
>> make[1]: *** [isoimage] Error 1
>
> I like these changes. Could we please further improve it: for example the boot
> image build messages are still pretty unstructured, while regular build system
> messages come in the following format:
>
>   CC  arch/x86/events/msr.o
>   RELOCS  arch/x86/realmode/rm/realmode.relocs
>   OBJCOPY arch/x86/realmode/rm/realmode.bin
>   CC  arch/x86/kernel/signal.o
>   AS  arch/x86/realmode/rmpiggy.o
>   CC  ipc/msg.o
>   AR  arch/x86/ia32/built-in.o
>   CC  arch/x86/events/amd/iommu.o
>   CC  init/do_mounts.o
>   AR  arch/x86/realmode/built-in.o
>
> So instead of:
>
>> Kernel: arch/x86/boot/bzImage is ready  (#62)
>> rm -rf arch/x86/boot/isoimage
>> mkdir arch/x86/boot/isoimage
>> Using /usr/lib/ISOLINUX/isolinux.bin
>> Using /usr/lib/syslinux/modules/bios/ldlinux.c32
>> cp arch/x86/boot/bzImage arch/x86/boot/isoimage/linux
>
> Could we make it something more streamlined and similar to the rest of the
> build as well, like:
>
>   GEN arch/x86/boot/bzImage
>   GEN arch/x86/boot/isoimage
>   GEN arch/x86/boot/isoimage/linux
>
> I.e. only mention the new files built, with an appropriate prefix.
>
> I've Cc:-ed the kbuild maintainers, maybe they have a better suggestion
> instead of the 'GEN' abbreviation?
>

Generally, the abbreviation is the tool that has processed the target,
but if you do not find an appropriate one, 'GEN' is fine.




-- 
Best Regards
Masahiro Yamada


Re: [PATCH] net: hns: set correct return value

2017-10-31 Thread David Miller
From: Pan Bian 
Date: Mon, 30 Oct 2017 16:50:01 +0800

> The function of_parse_phandle() returns a NULL pointer if it cannot
> resolve a phandle property to a device_node pointer. In function
> hns_nic_dev_probe(), its return value is passed to PTR_ERR to extract
> the error code. However, in this case, the extracted error code will
> always be zero, which is unexpected.
> 
> Signed-off-by: Pan Bian 

Applied.
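
The point of the commit message is that PTR_ERR() on a NULL pointer yields 0,
which callers read as success. A user-space sketch with simplified copies of
the error-pointer helpers is below. lookup_status() is a hypothetical caller,
and the -19 (-ENODEV) it returns for NULL is chosen for illustration only;
the error code the actual patch uses is not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified user-space versions of the kernel's error-pointer helpers. */
static intptr_t PTR_ERR(const void *ptr)
{
	return (intptr_t)ptr;
}

static int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical caller: map a lookup result to an error code. A NULL
 * result needs its own check, because PTR_ERR(NULL) is 0.
 */
static intptr_t lookup_status(const void *node)
{
	if (!node)
		return -19;	/* -ENODEV, illustrative choice */
	if (IS_ERR(node))
		return PTR_ERR(node);
	return 0;
}
```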


linux-next: build warnings after merge of the drm tree

2017-10-31 Thread Stephen Rothwell
Hi Dave,

After merging the drm tree, today's linux-next build (x86_64 allmodconfig)
produced these warnings:

drivers/gpu/drm/vc4/vc4_bo.c: In function 'vc4_bo_stats_debugfs':
drivers/gpu/drm/vc4/vc4_bo.c:91:17: warning: format '%d' expects argument of type 'int', but argument 4 has type 'size_t {aka long unsigned int}' [-Wformat=]
   seq_printf(m, "%30s: %6dkb BOs (%d)\n", "userspace BO cache",
 ^
drivers/gpu/drm/vc4/vc4_bo.c:95:17: warning: format '%d' expects argument of type 'int', but argument 4 has type 'size_t {aka long unsigned int}' [-Wformat=]
   seq_printf(m, "%30s: %6dkb BOs (%d)\n", "total purged BO",
 ^

Introduced by commit

  b9f19259b84d ("drm/vc4: Add the DRM_IOCTL_VC4_GEM_MADVISE ioctl")

-- 
Cheers,
Stephen Rothwell
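
Warnings of this kind usually go away once the conversion matches the
argument type; for size_t that is %zu. The sketch below uses snprintf() in
user space as a stand-in for seq_printf(). format_bo_stats() is a
hypothetical helper, and it assumes both numeric arguments are size_t, which
the warning only confirms for the first one.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper modelled on the seq_printf() call in the warning:
 * size_t arguments are printed with %zu rather than %d.
 */
static int format_bo_stats(char *buf, size_t len, const char *name,
			   size_t size_kb, size_t count)
{
	return snprintf(buf, len, "%30s: %6zukb BOs (%zu)\n",
			name, size_kb, count);
}
```

Because %zu follows the width of size_t, the same format string is correct on
both 32-bit and 64-bit builds.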


Re: net: lapbether: fix double free

2017-10-31 Thread David Miller
From: Pan Bian 
Date: Sun, 29 Oct 2017 21:57:22 +0800

> The function netdev_priv() returns the private data of the device. The
> memory to store the private data is allocated in alloc_netdev() and is
> released in netdev_free(). Calling kfree() on the return value of
> netdev_priv() after netdev_free() results in a double free bug.
> 
> Signed-off-by: Pan Bian 

Applied.
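
The reason the extra kfree() is a double free is that the private area is
part of the device allocation itself. A toy user-space model of that layout
is sketched below; struct toy_netdev and the toy_* functions are hypothetical
and do not reflect the real struct net_device layout.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for struct net_device; not the kernel layout. */
struct toy_netdev {
	size_t priv_size;
	max_align_t priv[];	/* driver-private area, same allocation */
};

/* One allocation covers both the device and its private data. */
static struct toy_netdev *toy_alloc_netdev(size_t priv_size)
{
	struct toy_netdev *dev = calloc(1, sizeof(*dev) + priv_size);

	if (dev)
		dev->priv_size = priv_size;
	return dev;
}

/* Returns a pointer into the device allocation, not a separate object. */
static void *toy_netdev_priv(struct toy_netdev *dev)
{
	return dev->priv;
}

/* Releases the device and, with it, the private area. */
static void toy_free_netdev(struct toy_netdev *dev)
{
	free(dev);
}
```

In this model, calling free() on the pointer returned by toy_netdev_priv()
after toy_free_netdev() would release memory that has already been returned
to the allocator, which is the pattern the patch removes.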


Re: [PATCH] mkiss: remove redundant assignment of len to ax->mtu

2017-10-31 Thread David Miller
From: Colin King 
Date: Sun, 29 Oct 2017 13:30:25 +

> From: Colin Ian King 
> 
> Variable len is being assigned a value that is never read,
> hence the assignment is redundant and can be removed. Cleans
> up clang warning:
> 
> drivers/net/hamradio/mkiss.c:443:3: warning: Value stored to
> 'len' is never read
> 
> Signed-off-by: Colin Ian King 

Applied.


Re: [PATCH v3] dmaengine: rcar-dmac: use TCRB instead of TCR for residue

2017-10-31 Thread Kuninori Morimoto

Hi Geert, Vinod

Geert, thank you for your report,
Vinod, thank you for your quick help.

> > > This is now commit 847449f23dcbff68 ("dmaengine: rcar-dmac: use TCRB
> > > instead of TCR for residue") in slave-dma/next, and breaks serial console
> > > input on koelsch (shmobile_defconfig) and salvator-x (renesas_defconfig).
> > > Reverting that commit fixes the issue for me.

This patch solved my issue (= sound noise), but it is transferring
large size data. From "transferring data size" point of view,
my sound situation is same as your large serial console input situation?

I will ask this to HW guys.
Thanks

Best regards
---
Kuninori Morimoto


Re: [PATCH] net: decnet: dn_nsp_out: use swap macro in dn_mk_ack_header

2017-10-31 Thread David Miller
From: "Gustavo A. R. Silva" 
Date: Sat, 28 Oct 2017 15:39:48 -0500

> Make use of the swap macro and remove unnecessary variable tmp.
> This makes the code easier to read and maintain.
> 
> This code was detected with the help of Coccinelle.
> 
> Signed-off-by: Gustavo A. R. Silva 

Applied.
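
For readers unfamiliar with the helper, here is a small sketch of a swap macro modelled on the kernel one, assuming a compiler that supports the GNU __typeof__ extension. The struct hist_entry type and swap_entries() are made up for illustration and are not taken from the patched files.

```c
/*
 * Exchange two lvalues of the same type without an explicit
 * temporary at the call site. Relies on the GNU __typeof__ extension.
 */
#define swap(a, b) \
	do { __typeof__(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Hypothetical record type, used only to show a pointer swap. */
struct hist_entry {
	int seqno;
};

/* Before: tmp = *x; *x = *y; *y = tmp;  After: one swap() call. */
static void swap_entries(struct hist_entry **x, struct hist_entry **y)
{
	swap(*x, *y);
}
```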


Re: [PATCH] net: dccp: ccids: lib: packet_history: use swap macro in tfrc_rx_hist_swap

2017-10-31 Thread David Miller
From: "Gustavo A. R. Silva" 
Date: Sat, 28 Oct 2017 15:48:47 -0500

> Make use of the swap macro and remove unnecessary variable tmp.
> This makes the code easier to read and maintain.
> 
> This code was detected with the help of Coccinelle.
> 
> Signed-off-by: Gustavo A. R. Silva 

Applied.


Re: [PATCH] net: decnet: dn_nsp_in: use swap macro in dn_nsp_rx_packet

2017-10-31 Thread David Miller
From: "Gustavo A. R. Silva" 
Date: Sat, 28 Oct 2017 14:38:45 -0500

> Make use of the swap macro and remove unnecessary variable tmp.
> This makes the code easier to read and maintain.
> 
> This code was detected with the help of Coccinelle.
> 
> Signed-off-by: Gustavo A. R. Silva 

Applied.


Re: xfs: list corruption in xfs_setup_inode()

2017-10-31 Thread Dave Chinner
On Tue, Oct 31, 2017 at 06:51:08PM -0700, Cong Wang wrote:
> On Mon, Oct 30, 2017 at 5:33 PM, Dave Chinner  wrote:
> > On Mon, Oct 30, 2017 at 02:55:43PM -0700, Cong Wang wrote:
> >> Hello,
> >>
> >> We triggered a list corruption (double add) warning below on our 4.9
> >> kernel (the 4.9 kernel we use is based on -stable release, with only a
> >> few unrelated networking backports):
...
> >> 4.9.34.el7.x86_64 #1
> >> Hardware name: TYAN S5512/S5512, BIOS V8.B13 03/20/2014
> >>  b0d48a0abb30 8e389f47 b0d48a0abb80 
> >>  b0d48a0abb70 8e08989b 0024 8d9d691e0aa0
> >>  8d9d7a716608 8d9d691e0aa0 4000 8d9d7de6d800
> >> Call Trace:
> >>  [] dump_stack+0x4d/0x66
> >>  [] __warn+0xcb/0xf0
> >>  [] warn_slowpath_fmt+0x5f/0x80
> >>  [] __list_add+0xac/0xb0
> >>  [] inode_sb_list_add+0x3b/0x50
> >>  [] xfs_setup_inode+0x2c/0x170 [xfs]
> >>  [] xfs_ialloc+0x317/0x5c0 [xfs]
> >>  [] xfs_dir_ialloc+0x77/0x220 [xfs]
> >
> > Inode allocation, so should be a new inode straight from the slab
> > cache. That implies memory corruption of some kind. Please turn on
> > slab poisoning and try to reproduce.
> 
> Are you sure? xfs_iget() seems to search a cache before allocating
> a new one:

/me sighs

You started with "I don't know the XFS code very well", so I omitted
the complexity of describing about 10 different corner cases where
we /could/ find the unlinked inode still in the cache via the
lookup. But they aren't common cases - the common case in the real
world is allocation of cache cold inodes. IOWs: "so should be a new
inode straight from the slab cache".

So, yes, we could find the old unlinked inode still cached in the
XFS inode cache, but I don't have the time to explain how RCU lookup
code works to everyone who reports a bug.

All you need to understand is that all of this happens below the VFS,
and so an in-cache inode that is being reclaimed or newly allocated
should never, ever be on the VFS sb inode list.

> >>  [] ? down_write+0x12/0x40
> >>  [] xfs_create+0x482/0x760 [xfs]
> >>  [] xfs_generic_create+0x21e/0x2c0 [xfs]
> >>  [] xfs_vn_mknod+0x14/0x20 [xfs]
> >>  [] xfs_vn_mkdir+0x16/0x20 [xfs]
> >>  [] vfs_mkdir+0xe8/0x140
> >>  [] SyS_mkdir+0x7a/0xf0
> >>  [] entry_SYSCALL_64_fastpath+0x13/0x94
> >>
> >> _Without_ looking deeper, it seems this warning could be shut up by:
> >>
> >> --- a/fs/xfs/xfs_icache.c
> >> +++ b/fs/xfs/xfs_icache.c
> >> @@ -1138,6 +1138,8 @@ xfs_reclaim_inode(
> >> xfs_iunlock(ip, XFS_ILOCK_EXCL);
> >>
> >> XFS_STATS_INC(ip->i_mount, xs_ig_reclaims);
> >> +
> >> +   inode_sb_list_del(VFS_I(ip));
> >>
> >> with properly exporting inode_sb_list_del(). Does this make any sense?
> >
> > No, because by this stage the inode has already been removed from
> > the superblock inode list. Doing this sort of thing here would just
> > paper over whatever the underlying problem might be.
> 
> 
> For me, it looks like the inode in the cache pag->pag_ici_root
> is not removed from sb list before removing from cache.

Sure, we have list corruption. Where we detect that corruption
implies nothing about the cause of the list corruption. The two
events are not connected in any way. Clearing that VFS list here
does nothing to fix the problem causing the list corruption to
occur.

> >> Please let me know if I can provide any other information.
> >
> > How do you reproduce the problem?
> 
> The warning is reported via ABRT email, we don't know what was
> happening at the time of crash.

Which makes it even harder to track down. Perhaps you should
configure the box to crashdump on such a failure and then we
can do some post-failure forensic analysis...

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH v2] tracing: Allocate mask_str buffer dynamically

2017-10-31 Thread Du, Changbin
Hi Rostedt,
On Tue, Oct 31, 2017 at 12:19:58PM -0400, Steven Rostedt wrote:
> On Thu, 26 Oct 2017 00:20:28 +0800
> changbin...@intel.com wrote:
> 
> > From: Changbin Du 
> > 
> > The default NR_CPUS can be very large, but actual possible nr_cpu_ids
> > usually is very small. For my x86 distribution, the NR_CPUS is 8192 and
> > nr_cpu_ids is 4. About 2 pages are wasted.
> > 
> > Most machines don't have so many CPUs, so defining an array of NR_CPUS
> > just wastes memory. So let's allocate the buffer dynamically when needed.
> > 
> > The exact buffer size should be:
> >   DIV_ROUND_UP(nr_cpu_ids, 4) + nr_cpu_ids/32 + 2;
> > 
> > Example output:
> >   ff,
> > 
> > Signed-off-by: Changbin Du 
> > 
> > ---
> > v2:
> >   - remove 'static' declaration.
> >   - fix buffer size.
> > ---
> >  kernel/trace/trace.c | 18 ++
> >  1 file changed, 10 insertions(+), 8 deletions(-)
> > 
> > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> > index 752e5da..6b70648 100644
> > --- a/kernel/trace/trace.c
> > +++ b/kernel/trace/trace.c
> > @@ -4184,31 +4184,33 @@ static const struct file_operations 
> > show_traces_fops = {
> >   */
> >  static DEFINE_MUTEX(tracing_cpumask_update_lock);
> 
> The above mutex was used to protect mask_str.
> 
> >  
> > -/*
> > - * Temporary storage for the character representation of the
> > - * CPU bitmask (and one more byte for the newline):
> > - */
> > -static char mask_str[NR_CPUS + 1];
> > -
> >  static ssize_t
> >  tracing_cpumask_read(struct file *filp, char __user *ubuf,
> >  size_t count, loff_t *ppos)
> >  {
> > struct trace_array *tr = file_inode(filp)->i_private;
> > +   char *mask_str;
> > int len;
> >  
> > +   /* Bitmap, ',' and two more bytes for the newline and '\0'. */
> > +   len = DIV_ROUND_UP(nr_cpu_ids, 4) + nr_cpu_ids/32 + 2;
> > +   mask_str = kmalloc(len, GFP_KERNEL);
> > +   if (!mask_str)
> > +   return -ENOMEM;
> > +
> > mutex_lock(&tracing_cpumask_update_lock);
> 
> This patch can remove the mutex as well, since there's no sharing of
> the mask anymore.
> 
> -- Steve
>
ok, let me remove it in v3.

> >  
> > -   len = snprintf(mask_str, count, "%*pb\n",
> > +   len = snprintf(mask_str, len, "%*pb\n",
> >cpumask_pr_args(tr->tracing_cpumask));
> > if (len >= count) {
> > count = -EINVAL;
> > goto out_err;
> > }
> > -   count = simple_read_from_buffer(ubuf, count, ppos, mask_str, NR_CPUS+1);
> > +   count = simple_read_from_buffer(ubuf, count, ppos, mask_str, len);
> >  
> >  out_err:
> > mutex_unlock(&tracing_cpumask_update_lock);
> > +   kfree(mask_str);
> >  
> > return count;
> >  }
> 
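
As a quick worked check of the size formula quoted above, the following user-space sketch evaluates it for a few CPU counts. mask_str_len() is a hypothetical helper name, and DIV_ROUND_UP is redefined locally so the snippet builds outside the kernel tree.

```c
/* Local copy of the kernel macro so the sketch is self-contained. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Bytes needed for the cpumask string: one hex digit per 4 CPUs,
 * one ',' per 32 CPUs, plus two bytes for the newline and '\0'.
 */
static unsigned int mask_str_len(unsigned int nr_cpu_ids)
{
	return DIV_ROUND_UP(nr_cpu_ids, 4) + nr_cpu_ids / 32 + 2;
}
```

For nr_cpu_ids = 4 this yields 3 bytes, compared with the NR_CPUS + 1 = 8193 bytes of the static buffer when NR_CPUS is 8192; even with 8192 possible CPUs the formula asks for only 2306 bytes.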

-- 
Thanks,
Changbin Du


Re: [PATCH net-next 0/7] net: dsa: add port parsing functions

2017-10-31 Thread David Miller
From: Vivien Didelot 
Date: Fri, 27 Oct 2017 15:55:12 -0400

> This patchset adds port parsing functions called early in the new
> bindings parsing stage, which regroup all the fetching of static data
> available at the port level, including the port's type, name and CPU
> master interface.
> 
> This simplifies the rest of the code which does not need to dig into
> device tree or platform data again in order to check a port's type or
> name.

Series applied, thanks Vivien.


[GIT PULL v2] Thermal SoC management updates for v4.15-rc1 #1

2017-10-31 Thread Eduardo Valentin
Hello Rui,

Please pull the changes for thermal-soc for the coming v4.15-rc1.

Changelog:
- New drivers: Rockchip RV1108 and Broadcom AVS tmon.
- Major rework on HISI driver plus additional support of hisi3660.
- Several fixes on diverse drivers and few in core.

Difference from V1:
- This is now based on v4.14-rc1.

BR,
Eduardo Valentin

The following changes since commit 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e:

  Linux 4.14-rc1 (2017-09-16 15:47:51 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal linus

for you to fetch changes up to b2fd708ffa7f43ce8680271924e1771e26a3ec91:

  thermal: cpu_cooling: pr_err() strings should end with newlines (2017-10-31 19:32:19 -0700)


Allen Wild (1):
  thermal: enable broadcom menu for arm64 bcm2835

Arvind Yadav (1):
  thermal: cpu_cooling: pr_err() strings should end with newlines

Baruch Siach (1):
  thermal: armada: fix formula documentation comment

Brian Norris (2):
  Documentation: devicetree: add binding for Broadcom STB AVS TMON
  thermal: add brcmstb AVS TMON driver

Daniel Lezcano (16):
  thermal/drivers/hisi: Fix missing interrupt enablement
  thermal/drivers/hisi: Remove the multiple sensors support
  thermal/drivers/hisi: Fix kernel panic on alarm interrupt
  thermal/drivers/hisi: Simplify the temperature/step computation
  thermal/drivers/hisi: Fix multiple alarm interrupts firing
  thermal/drivers/hisi: Remove pointless lock
  thermal/drivers/hisi: Encapsulate register writes into helpers
  thermal/drivers/hisi: Fix configuration register setting
  thermal/drivers/hisi: Remove costly sensor inspection
  thermal/drivers/hisi: Rename and remove unused field
  thermal/drivers/hisi: Convert long to int
  thermal/drivers/hisi: Remove thermal data back pointer
  thermal/drivers/hisi: Remove mutex_lock in the code
  thermal/drivers/step_wise: Fix temperature regulation misbehavior
  thermal/drivers/generic-iio-adc: Switch tz request to devm version
  thermal/drivers/qcom-spmi: Use devm_iio_channel_get

Kevin Wangtao (6):
  thermal/drivers/hisi: Move the clk setup in the corresponding functions
  thermal/drivers/hisi: Use round up step value
  thermal/drivers/hisi: Put platform code together
  thermal/drivers/hisi: Add platform prefix to function name
  thermal/drivers/hisi: Prepare to add support for other hisi platforms
  thermal/drivers/hisi: Add support for hi3660 SoC

Nicolin Chen (1):
  thermal: tegra: remove null check for dev pointer

Niklas Söderlund (1):
  thermal: rcar_gen3_thermal: fix initialization sequence for H3 ES2.0

Rocky Hao (2):
  dt-bindings: rockchip-thermal: Support the RV1108 SoC compatible
  thermal: rockchip: Support the RV1108 SoC in thermal driver

Tony Lindgren (1):
  thermal: ti-soc-thermal: Fix ti_thermal_unregister_cpu_cooling NULL pointer on unload

 .../devicetree/bindings/thermal/brcm,avs-tmon.txt  |  20 +
 .../bindings/thermal/rockchip-thermal.txt  |   1 +
 MAINTAINERS|   8 +
 drivers/thermal/Kconfig|   2 +-
 drivers/thermal/armada_thermal.c   |   2 +-
 drivers/thermal/broadcom/Kconfig   |   7 +
 drivers/thermal/broadcom/Makefile  |   1 +
 drivers/thermal/broadcom/brcmstb_thermal.c | 387 +
 drivers/thermal/cpu_cooling.c  |   2 +-
 drivers/thermal/hisi_thermal.c | 612 ++---
 drivers/thermal/qcom-spmi-temp-alarm.c |  43 +-
 drivers/thermal/rcar_gen3_thermal.c|  34 +-
 drivers/thermal/rockchip_thermal.c |  67 +++
 drivers/thermal/step_wise.c|  11 +-
 drivers/thermal/tegra/soctherm.c   |   2 +-
 drivers/thermal/thermal-generic-adc.c  |  24 +-
 drivers/thermal/ti-soc-thermal/ti-thermal-common.c |   3 +-
 17 files changed, 940 insertions(+), 286 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/thermal/brcm,avs-tmon.txt
 create mode 100644 drivers/thermal/broadcom/brcmstb_thermal.c


Re: [PATCH 0/2][v2] Add the ability to do BPF directed error injection

2017-10-31 Thread David Miller
From: Alexei Starovoitov 
Date: Tue, 31 Oct 2017 18:58:00 -0700

> i don't think it will apply to anything but net-next. If it goes any
> other tree we will have major conflicts during merge window.
> btw I haven't reviewed them for the second time.

Ok, then I'll need to see some ACKs from the tracing folks.

Thank you.


[PATCH v4 1/2] dt-bindings: mfd: Add Spreadtrum SC27xx PMIC documentation

2017-10-31 Thread Baolin Wang
This patch adds the binding documentation for Spreadtrum SC27xx series
PMIC device.

Signed-off-by: Baolin Wang 
Acked-by: Rob Herring 
Acked-by: Lee Jones 
---
Changes since v3:
 - No Updates.

Changes since v2:
 - Add acked tag from Rob and Lee.

Changes since v1:
 - Add more documentation to introduce Spreadtrum SC27xx series PMICs.
 - Modify compatible string property.
 - Modify reg property.
 - Remove redundant 'pmic' label.
 - Change 'should be' to 'must be' for cells properties.
---
 .../devicetree/bindings/mfd/sprd,sc27xx-pmic.txt   |   40 
 1 file changed, 40 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mfd/sprd,sc27xx-pmic.txt

diff --git a/Documentation/devicetree/bindings/mfd/sprd,sc27xx-pmic.txt 
b/Documentation/devicetree/bindings/mfd/sprd,sc27xx-pmic.txt
new file mode 100644
index 000..21b9a89
--- /dev/null
+++ b/Documentation/devicetree/bindings/mfd/sprd,sc27xx-pmic.txt
@@ -0,0 +1,40 @@
+Spreadtrum SC27xx Power Management Integrated Circuit (PMIC)
+
+The Spreadtrum SC27xx series PMICs contain SC2720, SC2721, SC2723, SC2730
+and SC2731. The Spreadtrum PMIC belonging to SC27xx series integrates all
+mobile handset power management, audio codec, battery management and user
+interface support function in a single chip. It has 6 major functional
+blocks:
+- DCDCs to support CPU, memory.
+- LDOs to support both internal and external requirement.
+- Battery management system, such as charger, fuel gauge.
+- Audio codec.
+- User interface function, such as indicator, flash LED and so on.
+- IC level interface, such as power on/off control, RTC and typec and so on.
+
+Required properties:
+- compatible: Should be one of the following:
+   "sprd,sc2720"
+   "sprd,sc2721"
+   "sprd,sc2723"
+   "sprd,sc2730"
+   "sprd,sc2731"
+- reg: The address of the device chip select, should be 0.
+- spi-max-frequency: Typically set to 2600.
+- interrupts: The interrupt line the device is connected to.
+- interrupt-controller: Marks the device node as an interrupt controller.
+- #interrupt-cells: The number of cells to describe an PMIC IRQ, must be 2.
+- #address-cells: Child device offset number of cells, must be 1.
+- #size-cells: Child device size number of cells, must be 0.
+
+Example:
+pmic@0 {
+   compatible = "sprd,sc2731";
+   reg = <0>;
+   spi-max-frequency = <2600>;
+   interrupts = ;
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+};
-- 
1.7.9.5



[PATCH v4 2/2] mfd: Add Spreadtrum SC27xx series PMICs driver

2017-10-31 Thread Baolin Wang
This patch adds support for the Spreadtrum SC27xx series PMIC MFD core, which
provides communication through the SPI interface. The SC27xx series PMICs
contain the following 6 major components:
- DCDCs
- LDOs
- Battery management system
- Audio codec
- User interface function, such as indicator, flash LED
- IC level function, such as power on/off, type-c

Signed-off-by: Baolin Wang 
---
Changes since v3:
 - Use memcpy() to copy register offset address into SPI buffer.

Changes since v2:
 - Add more help information.
 - Define macros for irq base and irq number.
 - Use devm_mfd_add_devices() instead of mfd_add_devices(), which means
 we can remove sprd_pmic_remove().
 - Rename local variables in sprd_pmic_probe().

Changes since v1:
 - Add more documentation to introduce Spreadtrum SC27xx series PMICs.
 - Modify compatible string property.
 - Modify reg property.
 - Remove redundant 'pmic' label.
 - Change 'should be' to 'must be' for cells properties.
---
 drivers/mfd/Kconfig   |   16 +++
 drivers/mfd/Makefile  |1 +
 drivers/mfd/sprd-sc27xx-spi.c |  259 +
 3 files changed, 276 insertions(+)
 create mode 100644 drivers/mfd/sprd-sc27xx-spi.c

diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index fc5e4fe..67091bb 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -1057,6 +1057,22 @@ config MFD_SMSC
 To compile this driver as a module, choose M here: the
 module will be called smsc.
 
+config MFD_SC27XX_PMIC
+   tristate "Spreadtrum SC27xx PMICs"
+   depends on ARCH_SPRD || COMPILE_TEST
+   depends on SPI_MASTER
+   select MFD_CORE
+   select REGMAP_SPI
+   select REGMAP_IRQ
+   help
+ This enables support for the Spreadtrum SC27xx PMICs with SPI
+ interface. The SC27xx series PMICs integrate power management,
+ audio codec, battery management and user interface support
+ function (such as RTC, Typec, indicator and so on) in a single chip.
+
+ This driver provides common support for accessing the SC27xx PMICs,
+ and it also adds the irq_chip parts for handling the PMIC chip events.
+
 config ABX500_CORE
bool "ST-Ericsson ABX500 Mixed Signal Circuit register functions"
default y if ARCH_U300 || ARCH_U8500 || COMPILE_TEST
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index c3d0a1b..a377e0f 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -226,3 +226,4 @@ obj-$(CONFIG_MFD_SUN4I_GPADC)   += sun4i-gpadc.o
 obj-$(CONFIG_MFD_STM32_LPTIMER)+= stm32-lptimer.o
 obj-$(CONFIG_MFD_STM32_TIMERS) += stm32-timers.o
 obj-$(CONFIG_MFD_MXS_LRADC) += mxs-lradc.o
+obj-$(CONFIG_MFD_SC27XX_PMIC)  += sprd-sc27xx-spi.o
diff --git a/drivers/mfd/sprd-sc27xx-spi.c b/drivers/mfd/sprd-sc27xx-spi.c
new file mode 100644
index 000..56a4782
--- /dev/null
+++ b/drivers/mfd/sprd-sc27xx-spi.c
@@ -0,0 +1,259 @@
+/*
+ * Copyright (C) 2017 Spreadtrum Communications Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define SPRD_PMIC_INT_MASK_STATUS  0x0
+#define SPRD_PMIC_INT_RAW_STATUS   0x4
+#define SPRD_PMIC_INT_EN   0x8
+
+#define SPRD_SC2731_IRQ_BASE   0x140
+#define SPRD_SC2731_IRQ_NUMS   16
+
+struct sprd_pmic {
+   struct regmap *regmap;
+   struct device *dev;
+   struct regmap_irq *irqs;
+   struct regmap_irq_chip irq_chip;
+   struct regmap_irq_chip_data *irq_data;
+   int irq;
+};
+
+struct sprd_pmic_data {
+   u32 irq_base;
+   u32 num_irqs;
+};
+
+/*
+ * Since different PMICs of SC27xx series can have different interrupt
+ * base address and irq number, we should save irq number and irq base
+ * in the device data structure.
+ */
+static const struct sprd_pmic_data sc2731_data = {
+   .irq_base = SPRD_SC2731_IRQ_BASE,
+   .num_irqs = SPRD_SC2731_IRQ_NUMS,
+};
+
+static const struct mfd_cell sprd_pmic_devs[] = {
+   {
+   .name = "sc27xx-wdt",
+   .of_compatible = "sprd,sc27xx-wdt",
+   }, {
+   .name = "sc27xx-rtc",
+   .of_compatible = "sprd,sc27xx-rtc",
+   }, {
+   .name = "sc27xx-charger",
+   .of_compatible = "sprd,sc27xx-charger",
+   }, {
+   .name = "sc27xx-chg-timer",
+   .of_compatible = "sprd,sc27xx-chg-timer",
+   }, {
+   .name = "sc27xx-fast-chg",
+   .of_compatible = "sprd,sc27

Re: [PATCH v2 1/2] KVM: X86: Fix operand size during instruction decoding

2017-10-31 Thread Pedro Fonseca
This patch fixes the problem and passes my tests on the CS.DB field, 
this includes the "push es" and the "high 16-bits of ESP" test cases.


Tested-by: Pedro Fonseca 

On 10/27/17 1:36 AM, Wanpeng Li wrote:

From: Wanpeng Li 

Pedro reported:
   During tests that we conducted on KVM, we noticed that executing a "PUSH %ES"
   instruction under KVM produces different results on both memory and the SP
   register depending on whether EPT support is enabled. With EPT the SP is
   reduced by 4 bytes (and the written value is 0-padded) but without EPT 
support
   it is only reduced by 2 bytes. The difference can be observed when the CS.DB
   field is 1 (32-bit) but not when it's 0 (16-bit).

The internal segment descriptor cache exists even in real/vm8086 mode, so CS.D
should also be respected during instruction decoding, rather than only the
default operand size and the 66H prefix. This patch fixes it by also adjusting
the operand size according to CS.D.

Reported-by: Pedro Fonseca 
Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Cc: Nadav Amit 
Cc: Pedro Fonseca 
Signed-off-by: Wanpeng Li 
---
v1 -> v2:
  * respect cs.d for real/vm8086, other modes have already
been considered in init_emulate_ctxt().

  arch/x86/kvm/emulate.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 8079d14..6ebc4cb 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -5000,6 +5000,8 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
bool op_prefix = false;
bool has_seg_override = false;
struct opcode opcode;
+   u16 dummy;
+   struct desc_struct desc;
  
  	ctxt->memop.type = OP_NONE;

ctxt->memopp = NULL;
@@ -5020,6 +5022,11 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
case X86EMUL_MODE_VM86:
case X86EMUL_MODE_PROT16:
def_op_bytes = def_ad_bytes = 2;
+   if (mode == X86EMUL_MODE_REAL || mode == X86EMUL_MODE_VM86) {
+   ctxt->ops->get_segment(ctxt, &dummy, &desc, NULL, 
VCPU_SREG_CS);
+   if (desc.d)
+   def_op_bytes = 4;
+   }
break;
case X86EMUL_MODE_PROT32:
def_op_bytes = def_ad_bytes = 4;
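
The decision the hunk above encodes can be sketched in isolation as follows. This is a simplified model under stated assumptions: the enum and function names are hypothetical, long mode is left out, and cs_d stands for the D bit of the cached CS descriptor.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of emulation modes. */
enum emul_mode {
	MODE_REAL,
	MODE_VM86,
	MODE_PROT16,
	MODE_PROT32,
};

/*
 * Default operand size in bytes before any 66H prefix is applied.
 * In real and VM86 mode the cached CS.D bit can still select 32-bit
 * operands, which is the case the patch adds.
 */
static int default_operand_bytes(enum emul_mode mode, bool cs_d)
{
	switch (mode) {
	case MODE_REAL:
	case MODE_VM86:
		return cs_d ? 4 : 2;
	case MODE_PROT16:
		return 2;
	case MODE_PROT32:
		return 4;
	}
	return 2;
}

/* A push lowers the stack pointer by the operand size. */
static unsigned int sp_after_push(unsigned int sp, int op_bytes)
{
	return sp - (unsigned int)op_bytes;
}
```

Under this model a "PUSH %ES" executed in real mode with CS.D set moves SP by 4 bytes, matching the behaviour reported with EPT enabled.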




Re: [GIT PULL] Thermal SoC management updates for v4.15-rc1 #1

2017-10-31 Thread Eduardo Valentin
Hey,

On Wed, Nov 01, 2017 at 10:29:46AM +0800, Zhang Rui wrote:
> On Mon, 2017-10-30 at 07:30 -0700, Eduardo Valentin wrote:
> > Hello Rui,
> > 
> > Please pull the changes for thermal-soc for the coming v4.15-rc1.
> > Changelog:
> > - New drivers: Rockchip RV1108 and Broadcom AVS tmon.
> > - Major rework on HISI driver plus additional support of hisi3660.
> > - Several fixes on diverse drivers and few in core.
> > 
> > BR,
> > 
> > The following changes since commit
> > 569dbb88e80deb68974ef6fdd6a13edb9d686261:
> > 
> >   Linux 4.13 (2017-09-03 13:56:17 -0700)
> > 
> please rebase it on top of v4.14-rc1 to avoid conflict, as we have a
> couple of thermal soc changes merged in 4.14-rc1.

Sure, let me see what I can do. I will send a refreshed pull request
shortly. The rebase has no interesting conflicts.

> 
> thanks,
> rui
> 
> > are available in the git repository at:
> > 
> >  
> > 
> > for you to fetch changes up to
> > 877a9aa9dadc7291b0069fb2ccdf2bbc1e3e6a6e:
> > 
> >   thermal: cpu_cooling: pr_err() strings should end with newlines
> > (2017-10-26 11:33:32 -0700)
> > 
> > 
> > Allen Wild (1):
> >   thermal: enable broadcom menu for arm64 bcm2835
> > 
> > Arvind Yadav (1):
> >   thermal: cpu_cooling: pr_err() strings should end with newlines
> > 
> > Baruch Siach (1):
> >   thermal: armada: fix formula documentation comment
> > 
> > Brian Norris (2):
> >   Documentation: devicetree: add binding for Broadcom STB AVS
> > TMON
> >   thermal: add brcmstb AVS TMON driver
> > 
> > Daniel Lezcano (16):
> >   thermal/drivers/hisi: Fix missing interrupt enablement
> >   thermal/drivers/hisi: Remove the multiple sensors support
> >   thermal/drivers/hisi: Fix kernel panic on alarm interrupt
> >   thermal/drivers/hisi: Simplify the temperature/step computation
> >   thermal/drivers/hisi: Fix multiple alarm interrupts firing
> >   thermal/drivers/hisi: Remove pointless lock
> >   thermal/drivers/hisi: Encapsulate register writes into helpers
> >   thermal/drivers/hisi: Fix configuration register setting
> >   thermal/drivers/hisi: Remove costly sensor inspection
> >   thermal/drivers/hisi: Rename and remove unused field
> >   thermal/drivers/hisi: Convert long to int
> >   thermal/drivers/hisi: Remove thermal data back pointer
> >   thermal/drivers/hisi: Remove mutex_lock in the code
> >   thermal/drivers/step_wise: Fix temperature regulation
> > misbehavior
> >   thermal/drivers/generic-iio-adc: Switch tz request to devm
> > version
> >   thermal/drivers/qcom-spmi: Use devm_iio_channel_get
> > 
> > Kevin Wangtao (6):
> >   thermal/drivers/hisi: Move the clk setup in the corresponding
> > functions
> >   thermal/drivers/hisi: Use round up step value
> >   thermal/drivers/hisi: Put platform code together
> >   thermal/drivers/hisi: Add platform prefix to function name
> >   thermal/drivers/hisi: Prepare to add support for other hisi
> > platforms
> >   thermal/drivers/hisi: Add support for hi3660 SoC
> > 
> > Nicolin Chen (1):
> >   thermal: tegra: remove null check for dev pointer
> > 
> > Niklas Söderlund (1):
> >   thermal: rcar_gen3_thermal: fix initialization sequence for H3
> > ES2.0
> > 
> > Rocky Hao (2):
> >   dt-bindings: rockchip-thermal: Support the RV1108 SoC
> > compatible
> >   thermal: rockchip: Support the RV1108 SoC in thermal driver
> > 
> > Tony Lindgren (1):
> >   thermal: ti-soc-thermal: Fix ti_thermal_unregister_cpu_cooling
> > NULL pointer on unload
> > 
> >  .../devicetree/bindings/thermal/brcm,avs-tmon.txt  |  20 +
> >  .../bindings/thermal/rockchip-thermal.txt  |   1 +
> >  MAINTAINERS|   8 +
> >  drivers/thermal/Kconfig|   2 +-
> >  drivers/thermal/armada_thermal.c   |   2 +-
> >  drivers/thermal/broadcom/Kconfig   |   7 +
> >  drivers/thermal/broadcom/Makefile  |   1 +
> >  drivers/thermal/broadcom/brcmstb_thermal.c | 387
> > +
> >  drivers/thermal/cpu_cooling.c  |   2 +-
> >  drivers/thermal/hisi_thermal.c | 612
> > ++---
> >  drivers/thermal/qcom-spmi-temp-alarm.c |  43 +-
> >  drivers/thermal/rcar_gen3_thermal.c|  34 +-
> >  drivers/thermal/rockchip_thermal.c |  67 +++
> >  drivers/thermal/step_wise.c|  11 +-
> >  drivers/thermal/tegra/soctherm.c   |   2 +-
> >  drivers/thermal/thermal-generic-adc.c  |  24 +-
> >  drivers/thermal/ti-soc-thermal/ti-thermal-common.c |   3 +-
> >  17 files changed, 940 insertions(+), 286 deletions(-)
> >  create mode 100644
> > Documentation/devicetree/bindings/thermal/brcm,avs-tmon.txt
> >  create mode 100644 drivers/thermal/broadcom/brcmstb_thermal.c


Re: [GIT PULL] Thermal SoC management updates for v4.15-rc1 #1

2017-10-31 Thread Zhang Rui
On Mon, 2017-10-30 at 07:30 -0700, Eduardo Valentin wrote:
> Hello Rui,
> 
> Please pull the changes for thermal-soc for the coming v4.15-rc1.
> Changelog:
> - New drivers: Rockchip RV1108 and Broadcom AVS tmon.
> - Major rework on HISI driver plus additional support of hisi3660.
> - Several fixes on diverse drivers and few in core.
> 
> BR,
> 
> The following changes since commit
> 569dbb88e80deb68974ef6fdd6a13edb9d686261:
> 
>   Linux 4.13 (2017-09-03 13:56:17 -0700)
> 
please rebase it on top of v4.14-rc1 to avoid conflict, as we have a
couple of thermal soc changes merged in 4.14-rc1.

thanks,
rui

> are available in the git repository at:
> 
>  
> 
> for you to fetch changes up to
> 877a9aa9dadc7291b0069fb2ccdf2bbc1e3e6a6e:
> 
>   thermal: cpu_cooling: pr_err() strings should end with newlines
> (2017-10-26 11:33:32 -0700)
> 
> 
> Allen Wild (1):
>   thermal: enable broadcom menu for arm64 bcm2835
> 
> Arvind Yadav (1):
>   thermal: cpu_cooling: pr_err() strings should end with newlines
> 
> Baruch Siach (1):
>   thermal: armada: fix formula documentation comment
> 
> Brian Norris (2):
>   Documentation: devicetree: add binding for Broadcom STB AVS
> TMON
>   thermal: add brcmstb AVS TMON driver
> 
> Daniel Lezcano (16):
>   thermal/drivers/hisi: Fix missing interrupt enablement
>   thermal/drivers/hisi: Remove the multiple sensors support
>   thermal/drivers/hisi: Fix kernel panic on alarm interrupt
>   thermal/drivers/hisi: Simplify the temperature/step computation
>   thermal/drivers/hisi: Fix multiple alarm interrupts firing
>   thermal/drivers/hisi: Remove pointless lock
>   thermal/drivers/hisi: Encapsulate register writes into helpers
>   thermal/drivers/hisi: Fix configuration register setting
>   thermal/drivers/hisi: Remove costly sensor inspection
>   thermal/drivers/hisi: Rename and remove unused field
>   thermal/drivers/hisi: Convert long to int
>   thermal/drivers/hisi: Remove thermal data back pointer
>   thermal/drivers/hisi: Remove mutex_lock in the code
>   thermal/drivers/step_wise: Fix temperature regulation
> misbehavior
>   thermal/drivers/generic-iio-adc: Switch tz request to devm
> version
>   thermal/drivers/qcom-spmi: Use devm_iio_channel_get
> 
> Kevin Wangtao (6):
>   thermal/drivers/hisi: Move the clk setup in the corresponding
> functions
>   thermal/drivers/hisi: Use round up step value
>   thermal/drivers/hisi: Put platform code together
>   thermal/drivers/hisi: Add platform prefix to function name
>   thermal/drivers/hisi: Prepare to add support for other hisi
> platforms
>   thermal/drivers/hisi: Add support for hi3660 SoC
> 
> Nicolin Chen (1):
>   thermal: tegra: remove null check for dev pointer
> 
> Niklas Söderlund (1):
>   thermal: rcar_gen3_thermal: fix initialization sequence for H3
> ES2.0
> 
> Rocky Hao (2):
>   dt-bindings: rockchip-thermal: Support the RV1108 SoC
> compatible
>   thermal: rockchip: Support the RV1108 SoC in thermal driver
> 
> Tony Lindgren (1):
>   thermal: ti-soc-thermal: Fix ti_thermal_unregister_cpu_cooling
> NULL pointer on unload
> 
>  .../devicetree/bindings/thermal/brcm,avs-tmon.txt  |  20 +
>  .../bindings/thermal/rockchip-thermal.txt  |   1 +
>  MAINTAINERS|   8 +
>  drivers/thermal/Kconfig|   2 +-
>  drivers/thermal/armada_thermal.c   |   2 +-
>  drivers/thermal/broadcom/Kconfig   |   7 +
>  drivers/thermal/broadcom/Makefile  |   1 +
>  drivers/thermal/broadcom/brcmstb_thermal.c | 387
> +
>  drivers/thermal/cpu_cooling.c  |   2 +-
>  drivers/thermal/hisi_thermal.c | 612
> ++---
>  drivers/thermal/qcom-spmi-temp-alarm.c |  43 +-
>  drivers/thermal/rcar_gen3_thermal.c|  34 +-
>  drivers/thermal/rockchip_thermal.c |  67 +++
>  drivers/thermal/step_wise.c|  11 +-
>  drivers/thermal/tegra/soctherm.c   |   2 +-
>  drivers/thermal/thermal-generic-adc.c  |  24 +-
>  drivers/thermal/ti-soc-thermal/ti-thermal-common.c |   3 +-
>  17 files changed, 940 insertions(+), 286 deletions(-)
>  create mode 100644
> Documentation/devicetree/bindings/thermal/brcm,avs-tmon.txt
>  create mode 100644 drivers/thermal/broadcom/brcmstb_thermal.c


Re: [PATCH V4 08/12] boot_constraint: Manage deferrable constraints

2017-10-31 Thread Viresh Kumar
On 31 October 2017 at 16:20, Rob Herring  wrote:
> What is the effect on boot time? It's highly platform dependent, but
> the worst case could be pretty bad I think.

Yeah, it can increase considerably here and I have plans for that; I just
didn't want to include them in the first iteration, to keep things simple.

What we can (should?) do is have the boot constraint framework hook
into other frameworks like regulators/clk/PM, etc., so that whenever a new
clk/regulator is added to those frameworks, they check for pending
requests from the boot constraint framework. If one is found, they can call a
callback of the boot constraint framework, which will set the constraints on the
resource before anyone else gets a chance. At that point we can remove the early
defer probing support that this patch is adding, and things would be quite fast
then.
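
To make the idea above concrete, here is a rough userspace sketch of the
pending-request hook. It is only a model under my own assumptions: every
name in it (demo_constraint_add_pending(), demo_resource_registered(), the
fixed-size table) is hypothetical and is not an existing kernel or boot
constraint API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define DEMO_MAX_PENDING 8

typedef void (*demo_constraint_cb)(const char *resource, void *data);

struct demo_pending {
	const char *resource;		/* clk/regulator name being waited for */
	demo_constraint_cb apply;	/* called as soon as the resource shows up */
	void *data;
	int used;
};

static struct demo_pending demo_pending_tbl[DEMO_MAX_PENDING];

/* Constraint side: queue a request for a resource that does not exist yet. */
static int demo_constraint_add_pending(const char *resource,
				       demo_constraint_cb apply, void *data)
{
	assert(resource != NULL && apply != NULL);

	for (size_t i = 0; i < DEMO_MAX_PENDING; i++) {
		if (!demo_pending_tbl[i].used) {
			demo_pending_tbl[i].resource = resource;
			demo_pending_tbl[i].apply = apply;
			demo_pending_tbl[i].data = data;
			demo_pending_tbl[i].used = 1;
			return 0;
		}
	}
	return -1;	/* table full */
}

/*
 * Resource framework side: called when a new clk/regulator is registered.
 * Applies every matching pending constraint before any consumer can touch
 * the resource, and returns how many were applied.
 */
static int demo_resource_registered(const char *resource)
{
	int applied = 0;

	for (size_t i = 0; i < DEMO_MAX_PENDING; i++) {
		struct demo_pending *p = &demo_pending_tbl[i];

		if (p->used && strcmp(p->resource, resource) == 0) {
			p->apply(resource, p->data);
			p->used = 0;
			applied++;
		}
	}
	return applied;
}

/* Example callback: just counts how often a constraint was applied. */
static void demo_count_applied(const char *resource, void *data)
{
	(void)resource;
	(*(int *)data)++;
}
```

With something along these lines the constraint is applied at registration
time, so no deferred probing would be needed for ordering.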

> I don't see how this handles the case you mentioned where the amba
> pclk gets disabled. It only works if the constraint device is added
> before any others, but that is done with initcall level games.

Yeah, so as I said earlier, the basic idea is that these constraints must get
set before any user driver (for constrained devices) comes up. And the
only way to do that is by making sure the constraints get added at early
initcall levels. The same is done for all the three example drivers I have
added.

The amba-pclk thing isn't an issue then, as that stuff happens only at probe
and not when the amba device is created.

--
viresh


Re: [PATCH 2/3] fpga: manager: don't use drvdata in common fpga code

2017-10-31 Thread Alan Tull
On Tue, Oct 31, 2017 at 8:34 PM, Moritz Fischer  wrote:
> On Tue, Oct 31, 2017 at 04:45:54PM -0500, Alan Tull wrote:
>> On Tue, Oct 31, 2017 at 3:59 PM, Moritz Fischer  wrote:
>> > On Tue, Oct 31, 2017 at 08:42:14PM +, Alan Tull wrote:
>> >> Changes to the fpga manager code to not use drvdata in common
>> >> code.
>> >>
>> >> Change fpga_mgr_register to not set or use drvdata.
>> >>
>> >> Change the register/unregister function parameters to take the mgr
>> >> struct:
>> >> * int fpga_mgr_register(struct device *dev,
>> >> struct fpga_manager *mgr);
>> >> * void fpga_mgr_unregister(struct fpga_manager *mgr);
>> >>
>> >> Change the drivers that call fpga_mgr_register to alloc the struct
>> >> fpga_manager (using devm_kzalloc) and partly fill it, adding name,
>> >> ops, and priv.
>> >>
>> >> The rationale is that setting drvdata is fine for DT based devices
>> >> that will have one manager, bridge, or region per platform device.
>> >> However PCIe based devices may have multiple FPGA mgr/bridge/regions
>> >> under one pcie device.  Without these changes, the PCIe solution has
>> >> to create an extra device for each child mgr/bridge/region to hold
>> >> drvdata.
>> >
>> > This looks very common, in fact several subsystems provide this two step
>> > way of registering things.
>> >
>> > - Allocate fpga_mgr via fpga_mgr_alloc() where you pass in the size of
>> >   the private data
>> > - Fill in some fields
>> > - fpga_mgr_register() where you pass in the fpga_mgr as suggested
>> >
>> > See for example the alloc_etherdev() for ethernet devices.
>> >
>>
>> Yes, I considered it when I was writing this.  I've seen it both ways.
>> In the case you mention, they've got reasons they absolutely needed to
>> do it that way.  alloc_etherdev() calls eventually to
>> alloc_netdev_mqs() which is about 100 lines of alloc'ing and
>> initializing a network device struct.
>
> Yeah, sure. I looked around some more. Other subsystems just alloc
> manually, seems fine with me.
>>
>> > The benefit of the API you proposed is that one could embed the fpga_mgr
>> > struct inside of another struct and get to the container via
>> > container_of() I guess ...
>>
>> Yes, let's keep it simple for now, as that gives us greater
>> flexibility.  We can add alloc functions when it becomes clear that it
>> won't get in the way of someone's use.
>
> Agreed.
>>
>> Alan
>>
>> >
>> >>
>> >> Signed-off-by: Alan Tull 
>> >> Reported-by: Jiuyue Ma 
>> >> ---
>> >>  Documentation/fpga/fpga-mgr.txt  | 23 ---
>> >>  drivers/fpga/altera-cvp.c| 17 +
>> >>  drivers/fpga/altera-pr-ip-core.c | 16 ++--
>> >>  drivers/fpga/altera-ps-spi.c | 17 ++---
>> >>  drivers/fpga/fpga-mgr.c  | 28 +++-
>> >>  drivers/fpga/ice40-spi.c | 19 +++
>> >>  drivers/fpga/socfpga-a10.c   | 15 ---
>> >>  drivers/fpga/socfpga.c   | 17 ++---
>> >>  drivers/fpga/ts73xx-fpga.c   | 17 ++---
>> >>  drivers/fpga/xilinx-spi.c| 17 ++---
>> >>  drivers/fpga/zynq-fpga.c | 15 ---
>> >>  include/linux/fpga/fpga-mgr.h|  6 ++
>> >>  12 files changed, 147 insertions(+), 60 deletions(-)
>> >>
>> >> diff --git a/Documentation/fpga/fpga-mgr.txt 
>> >> b/Documentation/fpga/fpga-mgr.txt
>> >> index cc6413e..7e5e5c8 100644
>> >> --- a/Documentation/fpga/fpga-mgr.txt
>> >> +++ b/Documentation/fpga/fpga-mgr.txt
>> >> @@ -67,11 +67,9 @@ fpga_mgr_unlock when done programming the FPGA.
>> >>  To register or unregister the low level FPGA-specific driver:
>> >>  -
>> >>
>> >> - int fpga_mgr_register(struct device *dev, const char *name,
>> >> -   const struct fpga_manager_ops *mops,
>> >> -   void *priv);
>> >> + int fpga_mgr_register(struct device *dev, struct fpga_manager *mgr);
>
> At that point you could also just give the struct fpga_manager a
> ->parent or ->dev that you populate with &pdev->dev or &spi->dev etc instead 
> of
> making it a separate parameter, this makes an odd mix of half and half here.

Yes, I'd have to call it parent as dev is taken.  I also noticed that
I forgot to edit the function parameter documentation in the c file.
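
For illustration only, the shape being discussed (parent stored in the
manager struct, register taking just the mgr, and the driver embedding the
mgr and getting back to its container) might look roughly like this in a
userspace model. All demo_* names are hypothetical stand-ins, not the
fpga-mgr API.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(), for this demo only. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_device {
	const char *name;
};

/* Hypothetical manager: the caller fills name and parent before registering. */
struct demo_fpga_manager {
	const char *name;
	struct demo_device *parent;	/* "dev" is taken, so call it parent */
	int registered;
};

/* Register takes only the manager; the parent comes from the struct. */
static int demo_fpga_mgr_register(struct demo_fpga_manager *mgr)
{
	if (!mgr || !mgr->name || !mgr->parent)
		return -1;	/* stands in for -EINVAL */
	mgr->registered = 1;
	return 0;
}

static void demo_fpga_mgr_unregister(struct demo_fpga_manager *mgr)
{
	assert(mgr != NULL);
	mgr->registered = 0;
}

/* A driver embeds the manager in its own private struct ... */
struct demo_driver_priv {
	int regs;
	struct demo_fpga_manager mgr;
};

/* ... and recovers the container from the manager pointer. */
static struct demo_driver_priv *demo_to_priv(struct demo_fpga_manager *mgr)
{
	return demo_container_of(mgr, struct demo_driver_priv, mgr);
}
```

A PCIe driver with several managers could then embed one such struct per
child without creating an extra device just to hold drvdata.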

>> >>
>> >> - void fpga_mgr_unregister(struct device *dev);
>> >> + void fpga_mgr_unregister(struct fpga_manager *mgr);
>> >>
>> >>  Use of these two functions is described below in "How To Support a new 
>> >> FPGA
>> >>  device."
>> >> @@ -148,8 +146,13 @@ static int socfpga_fpga_probe(struct platform_device 
>> >> *pdev)
>> >>  {
>> >>   struct device *dev = &pdev->dev;
>> >>   struct socfpga_fpga_priv *priv;
>> >> + struct fpga_manager *mgr;
>> >>   int ret;
>> >>
>> >> + mgr = devm_kzalloc(dev, sizeof(*mgr), GFP_KERNEL);
>> >> + if (!mgr)
>> >> + return -E

Re: [PATCH v1] MAINTAINERS: Step down from a co-maintaner of DW DMAC driver

2017-10-31 Thread Viresh Kumar
On 31 October 2017 at 12:56, Andy Shevchenko
 wrote:
> As discussed at ELCE 2017 there is little to anticipate from me in the
> future with regard to the driver, and since I have many things to keep
> an eye on, I would like to step down to simple designated reviewer.
>
> Signed-off-by: Andy Shevchenko 
> ---
>  MAINTAINERS | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 0630482e701b..4ad7f4598ff2 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -12986,7 +12986,7 @@ F:  
> Documentation/devicetree/bindings/gpio/snps-dwapb-gpio.txt
>
>  SYNOPSYS DESIGNWARE DMAC DRIVER
>  M: Viresh Kumar 
> -M: Andy Shevchenko 
> +R: Andy Shevchenko 
>  S: Maintained
>  F: include/linux/dma/dw.h
>  F: include/linux/platform_data/dma-dw.h

Acked-by: Viresh Kumar 


Re: [RFC V7 2/2] OPP: Allow "opp-hz" and "opp-microvolt" to contain magic values

2017-10-31 Thread Viresh Kumar
On 31 October 2017 at 16:02, Rob Herring  wrote:
> Why not a new property for magic values? opp-magic? Don't we want to
> know when we have magic values?

I had kept a separate property from the beginning (domain-performance-state)
and moved to using these magic values in the existing field because of the
suggestion Kevin gave earlier.

https://marc.info/?l=linux-kernel&m=149306082218001&w=2

I am not sure what to do now :)

> Wouldn't magic values in opp-hz get propagated to user space?

The OPP core puts them in debugfs just to know how the OPPs are
set. Otherwise, I am not sure that the power domain core/drivers would
be exposing that to user space.

> I can
> see the complaints now. "My 4GHz processor is running at 6Hz!" Just
> like people complain when BogoMIPS is not high enough.

Hmm, would adding the right user type in the bindings be good enough
for that? I mean, we can clearly state that only power domains that are
managed by firmware are allowed to use magic values here. But I'm not sure
if we should put such stuff in the bindings.

--
viresh


[PATCH v2] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules

2017-10-31 Thread Zhou Chengming
Changes:
- Add a comment about text_mutex protecting this on x86.

Fixes: 2cfa197 "ftrace/alternatives: Introducing *_text_reserved
functions"

We use alternatives_text_reserved() to check whether the address falls in
the fixed pieces reserved for alternatives, but the problem is that
we don't hold the smp_alt mutex when calling this function. So the list
traversal may encounter a deleted list_head if another path is doing
alternatives_smp_module_del().

One solution is to hold the smp_alt mutex before calling this
function, but the difficulty is that the callers of this
function, arch_prepare_kprobe() and arch_prepare_optimized_kprobe(),
are called inside the text_mutex. So we would have to take the smp_alt
mutex before we go into this arch-dependent code. But we can't do that
now, because the smp_alt mutex is arch-dependent; only x86 has it.
Maybe we could export another arch-dependent callback to solve this.

But there is a simpler way to handle this problem. We can reuse the
text_mutex to protect smp_alt_modules instead of using another mutex.
And all the arch dependent checks of kprobes are inside the text_mutex,
so it's safe now.
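
As a rough illustration of the locking rule described above (one mutex
covering both list updates and the reserved-range check), here is a small
userspace sketch using pthreads. It is a simplified model under my own
assumptions: text_lock stands in for text_mutex, the demo_* names are
hypothetical, and the range check is a plain overlap test rather than the
real alternatives_text_reserved() logic.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's text_mutex. */
static pthread_mutex_t text_lock = PTHREAD_MUTEX_INITIALIZER;

struct demo_alt_module {
	unsigned long start, end;	/* inclusive text range */
	struct demo_alt_module *next;
};

static struct demo_alt_module *demo_alt_modules;	/* protected by text_lock */

static int demo_alt_module_add(unsigned long start, unsigned long end)
{
	struct demo_alt_module *m;

	assert(start <= end);
	m = malloc(sizeof(*m));
	if (!m)
		return -1;
	m->start = start;
	m->end = end;

	pthread_mutex_lock(&text_lock);
	m->next = demo_alt_modules;
	demo_alt_modules = m;
	pthread_mutex_unlock(&text_lock);
	return 0;
}

static void demo_alt_module_del(unsigned long start)
{
	struct demo_alt_module **pp;

	pthread_mutex_lock(&text_lock);
	for (pp = &demo_alt_modules; *pp; pp = &(*pp)->next) {
		if ((*pp)->start == start) {
			struct demo_alt_module *dead = *pp;

			*pp = dead->next;
			free(dead);
			break;
		}
	}
	pthread_mutex_unlock(&text_lock);
}

/* Return 1 if [start, end] overlaps a tracked range. Must hold text_lock. */
static int demo_text_reserved_locked(unsigned long start, unsigned long end)
{
	struct demo_alt_module *m;

	for (m = demo_alt_modules; m; m = m->next)
		if (m->start <= end && start <= m->end)
			return 1;
	return 0;
}

/* Models the kprobe path: the check runs with text_lock already held. */
static int demo_prepare_probe(unsigned long addr)
{
	int reserved;

	pthread_mutex_lock(&text_lock);
	reserved = demo_text_reserved_locked(addr, addr);
	pthread_mutex_unlock(&text_lock);
	return reserved ? -1 : 0;
}
```

Because deletion and traversal take the same lock, the traversal can no
longer observe a freed entry.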

Reviewed-by: Masami Hiramatsu 
Signed-off-by: Zhou Chengming 
---
 arch/x86/kernel/alternative.c | 24 +++-
 kernel/extable.c  |  2 ++
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 3344d33..55abbaa 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -442,7 +442,6 @@ static void alternatives_smp_lock(const s32 *start, const 
s32 *end,
 {
const s32 *poff;
 
-   mutex_lock(&text_mutex);
for (poff = start; poff < end; poff++) {
u8 *ptr = (u8 *)poff + *poff;
 
@@ -452,7 +451,6 @@ static void alternatives_smp_lock(const s32 *start, const 
s32 *end,
if (*ptr == 0x3e)
text_poke(ptr, ((unsigned char []){0xf0}), 1);
}
-   mutex_unlock(&text_mutex);
 }
 
 static void alternatives_smp_unlock(const s32 *start, const s32 *end,
@@ -460,7 +458,6 @@ static void alternatives_smp_unlock(const s32 *start, const 
s32 *end,
 {
const s32 *poff;
 
-   mutex_lock(&text_mutex);
for (poff = start; poff < end; poff++) {
u8 *ptr = (u8 *)poff + *poff;
 
@@ -470,7 +467,6 @@ static void alternatives_smp_unlock(const s32 *start, const 
s32 *end,
if (*ptr == 0xf0)
text_poke(ptr, ((unsigned char []){0x3E}), 1);
}
-   mutex_unlock(&text_mutex);
 }
 
 struct smp_alt_module {
@@ -489,8 +485,7 @@ struct smp_alt_module {
struct list_head next;
 };
 static LIST_HEAD(smp_alt_modules);
-static DEFINE_MUTEX(smp_alt);
-static bool uniproc_patched = false;   /* protected by smp_alt */
+static bool uniproc_patched = false;   /* protected by text_mutex */
 
 void __init_or_module alternatives_smp_module_add(struct module *mod,
  char *name,
@@ -499,7 +494,7 @@ void __init_or_module alternatives_smp_module_add(struct 
module *mod,
 {
struct smp_alt_module *smp;
 
-   mutex_lock(&smp_alt);
+   mutex_lock(&text_mutex);
if (!uniproc_patched)
goto unlock;
 
@@ -526,14 +521,14 @@ void __init_or_module alternatives_smp_module_add(struct 
module *mod,
 smp_unlock:
alternatives_smp_unlock(locks, locks_end, text, text_end);
 unlock:
-   mutex_unlock(&smp_alt);
+   mutex_unlock(&text_mutex);
 }
 
 void __init_or_module alternatives_smp_module_del(struct module *mod)
 {
struct smp_alt_module *item;
 
-   mutex_lock(&smp_alt);
+   mutex_lock(&text_mutex);
list_for_each_entry(item, &smp_alt_modules, next) {
if (mod != item->mod)
continue;
@@ -541,7 +536,7 @@ void __init_or_module alternatives_smp_module_del(struct 
module *mod)
kfree(item);
break;
}
-   mutex_unlock(&smp_alt);
+   mutex_unlock(&text_mutex);
 }
 
 void alternatives_enable_smp(void)
@@ -551,7 +546,7 @@ void alternatives_enable_smp(void)
/* Why bother if there are no other CPUs? */
BUG_ON(num_possible_cpus() == 1);
 
-   mutex_lock(&smp_alt);
+   mutex_lock(&text_mutex);
 
if (uniproc_patched) {
pr_info("switching to SMP code\n");
@@ -563,10 +558,13 @@ void alternatives_enable_smp(void)
  mod->text, mod->text_end);
uniproc_patched = false;
}
-   mutex_unlock(&smp_alt);
+   mutex_unlock(&text_mutex);
 }
 
-/* Return 1 if the address range is reserved for smp-alternatives */
+/*
+ * Return 1 if the address range is reserved for smp-alternatives.
+ * Must hold text_mutex.
+ */
 int alternatives_text_reserved(void *start, void *end)
 {
struct smp_alt_module *mod;
diff --git a/kernel/extable.c b/kernel/extable.c
index 9aa1cc4..ec64cf

Re: [PATCH] f2fs: don't bother with inode->i_version

2017-10-31 Thread Chao Yu
On 2017/10/30 23:11, Jeff Layton wrote:
> From: Jeff Layton 
> 
> f2fs does not set the SB_I_VERSION flag, so the i_version will never
> be incremented on write. It was recently changed to increment the
> i_version on a quota write, which isn't necessary here.
> 
> Signed-off-by: Jeff Layton 

Reviewed-by: Chao Yu 

Thanks,

> ---
>  fs/f2fs/super.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 933c3d529e65..b3359158e7be 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -618,7 +618,6 @@ static struct inode *f2fs_alloc_inode(struct super_block 
> *sb)
>   init_once((void *) fi);
>  
>   /* Initialize f2fs-specific inode info */
> - fi->vfs_inode.i_version = 1;
>   atomic_set(&fi->dirty_pages, 0);
>   fi->i_current_depth = 1;
>   fi->i_advise = 0;
> @@ -1386,7 +1385,6 @@ static ssize_t f2fs_quota_write(struct super_block *sb, 
> int type,
>  
>   if (len == towrite)
>   return 0;
> - inode->i_version++;
>   inode->i_mtime = inode->i_ctime = current_time(inode);
>   f2fs_mark_inode_dirty_sync(inode, false);
>   return len - towrite;
> 



Re: [PATCH] hwmon: (aspeed-pwm-tacho) Deassert reset in probe

2017-10-31 Thread Guenter Roeck

On 10/31/2017 07:04 PM, Stafford Horne wrote:
> On Tue, Oct 31, 2017 at 06:53:15PM -0700, Guenter Roeck wrote:
> > On 10/31/2017 06:34 PM, Joel Stanley wrote:
> > > The ASPEED SoC must deassert a reset in order to use the PWM/tach
> > > peripheral.
> > >
> > > The device tree bindings are updated to document the resets phandle, and
> > > the example is updated to match what is expected for both the reset and
> > > clock phandle. Note that the bindings should have always had the reset
> > > controller, as the hardware is unusable without it.
> > >
> > > Signed-off-by: Joel Stanley 
> >
> > Presumably the driver is being used. This change makes it incompatible with
> > existing users. This is unacceptable; after all, it is possible that the
> > device is taken out of reset by ROMMON or BIOS.
> >
> > On top of that, the reset controller code is quite strict and issues a
> > backtrace if CONFIG_RESET_CONTROLLER is not enabled. Yet, there is no
> > dependency added on RESET_CONTROLLER. You might want to consider making
> > the new control optional and using
> > devm_reset_control_get_optional_exclusive().
> >
> > The DT change should be a separate patch.
> >
> > More comments below.
>
> [..]
>
> > > return PTR_ERR_OR_ZERO(hwmon);
> > >   }
> > > +static int aspeed_pwm_tacho_remove(struct platform_device *pdev)
> > > +{
> > > +   struct aspeed_pwm_tacho_data *priv = platform_get_drvdata(pdev);
> > > +
> > > +   reset_control_deassert(priv->rst);
> >
> > This seems to be quite pointless. Also, did you test this code ?
> >
> > > +
> > > +   return 0;
> > > +}
> > > +
> > >   static const struct of_device_id of_pwm_tacho_match_table[] = {
> > > { .compatible = "aspeed,ast2400-pwm-tacho", },
> > > { .compatible = "aspeed,ast2500-pwm-tacho", },
> > > @@ -969,6 +989,7 @@ MODULE_DEVICE_TABLE(of, of_pwm_tacho_match_table);
> > >   static struct platform_driver aspeed_pwm_tacho_driver = {
> > > .probe  = aspeed_pwm_tacho_probe,
> > > +   .probe  = aspeed_pwm_tacho_remove,
>
> Also, this can't be right (should be .remove)?

Nice. Makes me really wonder what this code would do. Does this even compile ?

Guenter


[PATCH v2 2/2] mfd: syscon: Add hardware spinlock support

2017-10-31 Thread Baolin Wang
Some system control registers need a hardware spinlock to synchronize
between multiple subsystems, so add hardware spinlock support for
syscon.

Signed-off-by: Baolin Wang 
---
Changes since v1:
 - Remove timeout configuration.
 - Modify the binding file to add hwlocks.
---
 Documentation/devicetree/bindings/mfd/syscon.txt |1 +
 drivers/mfd/syscon.c |7 +++
 2 files changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/mfd/syscon.txt 
b/Documentation/devicetree/bindings/mfd/syscon.txt
index 408f768..3120fdf 100644
--- a/Documentation/devicetree/bindings/mfd/syscon.txt
+++ b/Documentation/devicetree/bindings/mfd/syscon.txt
@@ -16,6 +16,7 @@ Required properties:
 Optional property:
 - reg-io-width: the size (in bytes) of the IO accesses that should be
   performed on the device.
+- hwlocks: reference to a phandle of a hardware spinlock provider node.
 
 Examples:
 gpr: iomuxc-gpr@020e {
diff --git a/drivers/mfd/syscon.c b/drivers/mfd/syscon.c
index b93fe4c..f1dccce 100644
--- a/drivers/mfd/syscon.c
+++ b/drivers/mfd/syscon.c
@@ -13,6 +13,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -87,6 +88,12 @@ static struct syscon *of_syscon_register(struct device_node 
*np)
if (ret)
reg_io_width = 4;
 
+   ret = of_hwspin_lock_get_id(np, 0);
+   if (ret > 0) {
+   syscon_config.hwlock_id = ret;
+   syscon_config.hwlock_mode = HWLOCK_IRQSTATE;
+   }
+
syscon_config.reg_stride = reg_io_width;
syscon_config.val_bits = reg_io_width * 8;
syscon_config.max_register = resource_size(&res) - reg_io_width;
-- 
1.7.9.5



[PATCH v2 1/2] regmap: Add hardware spinlock support

2017-10-31 Thread Baolin Wang
On some platforms, when reading or writing some special registers through
regmap, we should acquire one hardware spinlock to synchronize between
the multiple subsystems. Thus this patch adds the hardware spinlock
support for regmap.
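
As a rough picture of the approach, the map picks a lock/unlock callback
pair according to the configured mode and wraps every register access in
it. The userspace sketch below only models that selection; the demo_* names
and the simulated "irqs_off" state are assumptions for illustration and are
not the regmap or hwspinlock API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical modes, loosely mirroring plain / irq / irqsave locking. */
enum demo_lock_mode { DEMO_LOCK_PLAIN, DEMO_LOCK_IRQ, DEMO_LOCK_IRQSAVE };

struct demo_map {
	void (*lock)(void *arg);
	void (*unlock)(void *arg);
	void *lock_arg;
	int held;		/* simulated hardware lock state */
	int irqs_off;		/* simulated local interrupt state */
	int saved_irqs_off;	/* state saved by the irqsave variant */
	unsigned int reg;	/* the one register this demo map has */
};

static void demo_lock_plain(void *arg)
{
	struct demo_map *m = arg;

	assert(!m->held);
	m->held = 1;
}

static void demo_unlock_plain(void *arg)
{
	struct demo_map *m = arg;

	m->held = 0;
}

static void demo_lock_irq(void *arg)
{
	struct demo_map *m = arg;

	m->irqs_off = 1;
	m->held = 1;
}

static void demo_unlock_irq(void *arg)
{
	struct demo_map *m = arg;

	m->held = 0;
	m->irqs_off = 0;	/* unconditionally re-enable */
}

static void demo_lock_irqsave(void *arg)
{
	struct demo_map *m = arg;

	m->saved_irqs_off = m->irqs_off;	/* remember the previous state */
	m->irqs_off = 1;
	m->held = 1;
}

static void demo_unlock_irqrestore(void *arg)
{
	struct demo_map *m = arg;

	m->held = 0;
	m->irqs_off = m->saved_irqs_off;	/* restore the previous state */
}

/* Choose the callback pair once, at init time, based on the mode. */
static void demo_map_init(struct demo_map *m, enum demo_lock_mode mode)
{
	switch (mode) {
	case DEMO_LOCK_IRQSAVE:
		m->lock = demo_lock_irqsave;
		m->unlock = demo_unlock_irqrestore;
		break;
	case DEMO_LOCK_IRQ:
		m->lock = demo_lock_irq;
		m->unlock = demo_unlock_irq;
		break;
	default:
		m->lock = demo_lock_plain;
		m->unlock = demo_unlock_plain;
		break;
	}
	m->lock_arg = m;
	m->held = 0;
	m->irqs_off = 0;
	m->saved_irqs_off = 0;
	m->reg = 0;
}

/* Every access goes through the selected lock/unlock pair. */
static void demo_map_write(struct demo_map *m, unsigned int val)
{
	m->lock(m->lock_arg);
	m->reg = val;
	m->unlock(m->lock_arg);
}
```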

Signed-off-by: Baolin Wang 
---
Changes since v1:
 - Move the hwspinlock.h include to the regmap.c file.
 - Remove hwspin_trylock_xxx() functions.
 - Always set the timeout to the maximum value to avoid failing to get the lock.
 - Coding style fixes.
---
 drivers/base/regmap/internal.h |2 +
 drivers/base/regmap/regmap.c   |  101 +---
 include/linux/regmap.h |6 +++
 3 files changed, 93 insertions(+), 16 deletions(-)

diff --git a/drivers/base/regmap/internal.h b/drivers/base/regmap/internal.h
index 2a4435d..8641183 100644
--- a/drivers/base/regmap/internal.h
+++ b/drivers/base/regmap/internal.h
@@ -157,6 +157,8 @@ struct regmap {
 
struct rb_root range_tree;
void *selector_work_buf;/* Scratch buffer used for selector */
+
+   struct hwspinlock *hwlock;
 };
 
 struct regcache_ops {
diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
index b9a779a..999e981 100644
--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
@@ -413,6 +414,49 @@ static unsigned int regmap_parse_64_native(const void *buf)
 }
 #endif
 
+static void regmap_lock_hwlock(void *__map)
+{
+   struct regmap *map = __map;
+
+   hwspin_lock_timeout(map->hwlock, UINT_MAX);
+}
+
+static void regmap_lock_hwlock_irq(void *__map)
+{
+   struct regmap *map = __map;
+
+   hwspin_lock_timeout_irq(map->hwlock, UINT_MAX);
+}
+
+static void regmap_lock_hwlock_irqsave(void *__map)
+{
+   struct regmap *map = __map;
+
+   hwspin_lock_timeout_irqsave(map->hwlock, UINT_MAX,
+   &map->spinlock_flags);
+}
+
+static void regmap_unlock_hwlock(void *__map)
+{
+   struct regmap *map = __map;
+
+   hwspin_unlock(map->hwlock);
+}
+
+static void regmap_unlock_hwlock_irq(void *__map)
+{
+   struct regmap *map = __map;
+
+   hwspin_unlock_irq(map->hwlock);
+}
+
+static void regmap_unlock_hwlock_irqrestore(void *__map)
+{
+   struct regmap *map = __map;
+
+   hwspin_unlock_irqrestore(map->hwlock, &map->spinlock_flags);
+}
+
 static void regmap_lock_mutex(void *__map)
 {
struct regmap *map = __map;
@@ -627,6 +671,29 @@ struct regmap *__regmap_init(struct device *dev,
map->lock = config->lock;
map->unlock = config->unlock;
map->lock_arg = config->lock_arg;
+   } else if (config->hwlock_id) {
+   map->hwlock = hwspin_lock_request_specific(config->hwlock_id);
+   if (!map->hwlock) {
+   ret = -ENXIO;
+   goto err_map;
+   }
+
+   switch (config->hwlock_mode) {
+   case HWLOCK_IRQSTATE:
+   map->lock = regmap_lock_hwlock_irqsave;
+   map->unlock = regmap_unlock_hwlock_irqrestore;
+   break;
+   case HWLOCK_IRQ:
+   map->lock = regmap_lock_hwlock_irq;
+   map->unlock = regmap_unlock_hwlock_irq;
+   break;
+   default:
+   map->lock = regmap_lock_hwlock;
+   map->unlock = regmap_unlock_hwlock;
+   break;
+   }
+
+   map->lock_arg = map;
} else {
if ((bus && bus->fast_io) ||
config->fast_io) {
@@ -729,7 +796,7 @@ struct regmap *__regmap_init(struct device *dev,
map->format.format_write = regmap_format_2_6_write;
break;
default:
-   goto err_map;
+   goto err_hwlock;
}
break;
 
@@ -739,7 +806,7 @@ struct regmap *__regmap_init(struct device *dev,
map->format.format_write = regmap_format_4_12_write;
break;
default:
-   goto err_map;
+   goto err_hwlock;
}
break;
 
@@ -749,7 +816,7 @@ struct regmap *__regmap_init(struct device *dev,
map->format.format_write = regmap_format_7_9_write;
break;
default:
-   goto err_map;
+   goto err_hwlock;
}
break;
 
@@ -759,7 +826,7 @@ struct regmap *__regmap_init(struct device *dev,
map->format.format_write = regmap_format_10_14_write;
break;
default:
-   goto err_map;
+   goto err_hwlock;
   

[BUG] tty: Userland can create hung tasks

2017-10-31 Thread Tejun Heo
Hello,

tty hangup code doesn't mark the console as being HUPed for, e.g.,
/dev/console and that can put the session leader trying to
disassociate from the controlling terminal in an indefinite D sleep.

Looking at the code, I have no idea why some tty devices are never
marked being hung up.  It *looks* intentional and dates back to the
git origin but I couldn't find any clue.  The following patch is a
workaround which fixes the observed problem but definitely isn't the
proper fix.

For details, please read the patch description.  If you scroll down,
there's a reproducer too.

Thanks.

-- 8< --
Subject: [PATCH] tty: make n_tty_read() always abort if hangup is in progress

__tty_hangup() sets file->f_op to hung_up_tty_fops iff the write
operation is tty_write(), which means that, for example, hanging up
/dev/console doesn't clear its f_op as its write op is
redirected_tty_write().

tty_hung_up_p() tests f_op for hung_up_tty_fops to determine whether
the terminal is (being) hung up.  In turn, n_tty_read() uses this test
to decide whether readers should abort due to hangup.

Combined, this means that n_tty_read() can't tell whether /dev/console
is being hung up or not.  This can lead to the following scenario.

 1. A session contains two processes.  The leader and its child.  The
child ignores SIGHUP.

 2. The leader exits and starts disassociating from the controlling
terminal (/dev/console).

 3. __tty_hangup() skips setting f_op to hung_up_tty_fops.

 4. SIGHUP is delivered and ignored.

 5. tty_ldisc_hangup() is invoked.  It wakes up the waiters, which should
clear the read lockers of tty->ldisc_sem.

 6. The reader wakes up but because tty_hung_up_p() is false, it
doesn't abort and goes back to sleep while read-holding
tty->ldisc_sem.

 7. The leader progresses to tty_ldisc_lock() in tty_ldisc_hangup()
and is now stuck in D sleep indefinitely waiting for
tty->ldisc_sem.

This leads to hung task warnings like the following.

  [  492.713289] INFO: task agetty:2662 blocked for more than 120 seconds.
  [  492.726170]   Not tainted 4.11.3-dbg-tty-lockup-02478-gfd6c7ee-dirty 
#28
  [  492.740264] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [  492.755919] 0  2662  1 0x0086
  [  492.763940] Call Trace:
  [  492.768834]  __schedule+0x267/0x890
  [  492.775816]  schedule+0x36/0x80
  [  492.782094]  schedule_timeout+0x23c/0x2e0
  [  492.790120]  ldsem_down_write+0xce/0x1f6
  [  492.797974]  tty_ldisc_lock+0x16/0x30
  [  492.805300]  tty_ldisc_hangup+0xb3/0x1b0
  [  492.813143]  __tty_hangup+0x300/0x410
  [  492.820470]  disassociate_ctty+0x6c/0x290
  [  492.828486]  do_exit+0x7ef/0xb00
  [  492.834946]  do_group_exit+0x3f/0xa0
  [  492.842092]  get_signal+0x1b3/0x5d0
  [  492.849077]  do_signal+0x28/0x660
  [  492.855720]  ? __fput+0x174/0x1e0
  [  492.862353]  ? __audit_syscall_exit+0x1f3/0x280
  [  492.871402]  exit_to_usermode_loop+0x46/0x86
  [  492.879926]  do_syscall_64+0x9c/0xb0
  [  492.887073]  entry_SYSCALL64_slow_path+0x25/0x25
  [  492.896295] RIP: 0033:0x7f69b3e7f783
  [  492.903438] RSP: 002b:7ffdcb249ca8 EFLAGS: 0246
  [  492.913868]  ORIG_RAX: 0017
  [  492.921536] RAX: fdfe RBX: 7ffdcb249cd0 RCX: 
7f69b3e7f783
  [  492.935786] RDX:  RSI: 7ffdcb249da0 RDI: 
0005
  [  492.950034] RBP: 7ffdcb249e20 R08:  R09: 
7ffdcb249c60
  [  492.964284] R10:  R11: 0246 R12: 

  [  492.964285] R13: 0004 R14: 0100 R15: 
7ffdcb24b750

This patch works around the issue by marking that hangup is in
progress in tty->flags regardless of the tty type and making
n_tty_read() abort accordingly.  This isn't a proper fix but does work
around the observed problem.
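
To illustrate the difference between the two hangup tests, here is a small
userspace model. It is only a sketch under my own assumptions: the demo_*
names and the DEMO_TTY_HUPPING bit are hypothetical and do not correspond
to the actual flag or helpers used by the patch.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_TTY_HUPPING (1u << 0)	/* hypothetical "hangup in progress" bit */

struct demo_tty {
	unsigned int flags;
	int f_op_hung_up;	/* models f_op == hung_up_tty_fops */
	size_t avail;		/* bytes available to the reader */
};

/* Old test: only the f_op swap counts as "hung up". */
static int demo_hung_up_old(const struct demo_tty *t)
{
	return t->f_op_hung_up;
}

/* Workaround: the flag counts too, regardless of the tty type. */
static int demo_hung_up_new(const struct demo_tty *t)
{
	return t->f_op_hung_up || (t->flags & DEMO_TTY_HUPPING) != 0;
}

/*
 * Hangup: the flag is always set, but f_op is swapped only when the write
 * op is the plain tty write (not the case for /dev/console).
 */
static void demo_hangup(struct demo_tty *t, int write_op_is_tty_write)
{
	assert(t != NULL);
	t->flags |= DEMO_TTY_HUPPING;
	if (write_op_is_tty_write)
		t->f_op_hung_up = 1;
}

/*
 * One wakeup of the reader: returns 1 if it would go back to sleep (still
 * read-holding the ldisc semaphore), 0 if it would return to the caller.
 */
static int demo_reader_would_sleep(const struct demo_tty *t,
				   int (*hung_up)(const struct demo_tty *))
{
	if (t->avail > 0)
		return 0;	/* data to return */
	if (hung_up(t))
		return 0;	/* abort due to hangup */
	return 1;
}
```

In this model the console reader keeps sleeping under the old test and
aborts under the new one, which is the behaviour change the workaround is
after.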

The following is the repro.  Run "$PROG /dev/console".  The parent
process hangs in D state.

  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 

  int main(int argc, char **argv)
  {
  struct sigaction sact = { .sa_handler = SIG_IGN };
  struct timespec ts1s = { .tv_sec = 1 };
  pid_t pid;
  int fd;

  if (argc < 2) {
  fprintf(stderr, "test-hung-tty /dev/$TTY\n");
  return 1;
  }

  /* fork a child to ensure that it isn't already the session leader */
  pid = fork();
  if (pid < 0) {
  perror("fork");
  return 1;
  }

  if (pid > 0) {
  /* top parent, wait for everyone */
  while (waitpid(-1, NULL, 0) >= 0)
  ;
  if (errno != ECHILD)
  perror("waitpid");
  return 0;
  }

  /* new session, start a new session and set the controlling tty */
  if (se

Re: KASAN: use-after-free in move_expired_inodes

2017-10-31 Thread Shankara Pailoor
Hi Al, etc,

I was unable to find a reproducer, but I was looking at
move_expired_inodes (fs/fs-writeback.c 1093.c) and wondering: how do you
ensure that the inode can't be freed after retrieving it from the work queue?
Any insights would be appreciated.

Regards,
Shankara

On Tue, Oct 31, 2017 at 9:24 AM, Shankara Pailoor  wrote:
> Hi,
>
> We got the following error:
>
> BUG: KASAN: use-after-free in move_expired_inodes+0xce6/0xdf0
> Write of size 8 at addr 8800a3a36bf8 by task kworker/u8:0/5
>
> while fuzzing with Syzkaller on 4.14-rc4 on x86_64. Included is the
> trace of the crash along with the programs running around the time of
> the crash.
>
> Programs can be found here: https://pastebin.com/RYGtNn3z
>
> Stack trace here: https://pastebin.com/SaJXWMg3
>
> We don't have a C reproducer but we will send one if we have it.
>
> Regards,
> Shankara



Re: [PATCH] hwmon: (aspeed-pwm-tacho) Deassert reset in probe

2017-10-31 Thread Stafford Horne
On Tue, Oct 31, 2017 at 06:53:15PM -0700, Guenter Roeck wrote:
> On 10/31/2017 06:34 PM, Joel Stanley wrote:
> > The ASPEED SoC must deassert a reset in order to use the PWM/tach
> > peripheral.
> > 
> > The device tree bindings are updated to document the resets phandle, and
> > the example is updated to match what is expected for both the reset and
> > clock phandle. Note that the bindings should have always had the reset
> > controller, as the hardware is unusable without it.
> > 
> > Signed-off-by: Joel Stanley 
> 
> Presumably the driver is being used. This change makes it incompatible with
> existing users. This is unacceptable; after all, it is possible that the
> device is taken out of reset by ROMMON or BIOS.
> 
> On top of that, the reset controller code is quite strict and issues a
> backtrace if CONFIG_RESET_CONTROLLER is not enabled. Yet, there is no
> dependency added on RESET_CONTROLLER. You might want to consider making
> the new control optional and using 
> devm_reset_control_get_optional_exclusive().
> 
> The DT change should be a separate patch.
> 
> More comments below.

[..]

> > return PTR_ERR_OR_ZERO(hwmon);
> >   }
> > +static int aspeed_pwm_tacho_remove(struct platform_device *pdev)
> > +{
> > +   struct aspeed_pwm_tacho_data *priv = platform_get_drvdata(pdev);
> > +
> > +   reset_control_deassert(priv->rst);
> 
> This seems to be quite pointless. Also, did you test this code ?
> 
> > +
> > +   return 0;
> > +}
> > +
> >   static const struct of_device_id of_pwm_tacho_match_table[] = {
> > { .compatible = "aspeed,ast2400-pwm-tacho", },
> > { .compatible = "aspeed,ast2500-pwm-tacho", },
> > @@ -969,6 +989,7 @@ MODULE_DEVICE_TABLE(of, of_pwm_tacho_match_table);
> >   static struct platform_driver aspeed_pwm_tacho_driver = {
> > .probe  = aspeed_pwm_tacho_probe,
> > +   .probe  = aspeed_pwm_tacho_remove,

Also, this can't be right (should be .remove)?

> > .driver = {
> > .name   = "aspeed_pwm_tacho",
> > .of_match_table = of_pwm_tacho_match_table,
> > 
> 


Re: [PATCH 0/2][v2] Add the ability to do BPF directed error injection

2017-10-31 Thread Alexei Starovoitov

On 10/31/17 6:55 PM, David Miller wrote:
> From: Josef Bacik 
> Date: Tue, 31 Oct 2017 11:45:55 -0400
>
> > v1->v2:
> > - moved things around to make sure that bpf_override_return could really only be
> >   used for an ftrace kprobe.
> > - killed the special return values from trace_call_bpf.
> > - renamed pc_modified to bpf_kprobe_state so bpf_override_return could tell if
> >   it was being called from an ftrace kprobe context.
> > - reworked the logic in kprobe_perf_func to take advantage of bpf_kprobe_state.
> > - updated the test as per Alexei's review.
> >
> > A lot of our error paths are not well tested because we have no good way of
> > injecting errors generically.  Some subsystems (block, memory) have ways to
> > inject errors, but they are random so it's hard to get reproducible results.
> >
> > With BPF we can add determinism to our error injection.  We can use kprobes and
> > other things to verify we are injecting errors at the exact case we are trying
> > to test.  This patch gives us the tool to actually do the error injection part.
> > It is very simple, we just set the return value of the pt_regs we're given to
> > whatever we provide, and then override the PC with a dummy function that simply
> > returns.
> >
> > Right now this only works on x86, but it would be simple enough to expand to
> > other architectures.  Thanks,
>
> This appears to target the tracing tree more than the networking tree.
>
> Let me know if that's not the case and I should be the one integrating
> these changes.

I don't think it will apply to anything but net-next. If it goes into any
other tree we will have major conflicts during the merge window.
Btw, I haven't reviewed them a second time.



[PATCH 1/8] ls1043ardb: add qe node to ls1043ardb

2017-10-31 Thread Zhao Qiang
Signed-off-by: Zhao Qiang 
---
 arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dts | 16 ++
 arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi| 66 +++
 2 files changed, 82 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dts 
b/arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dts
index c37110b..8f23f39 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dts
@@ -132,6 +132,22 @@
};
 };
 
+&uqe {
+   ucc_hdlc: ucc@2000 {
+   compatible = "fsl,ucc-hdlc";
+   rx-clock-name = "clk8";
+   tx-clock-name = "clk9";
+   fsl,rx-sync-clock = "rsync_pin";
+   fsl,tx-sync-clock = "tsync_pin";
+   fsl,tx-timeslot-mask = <0xfffe>;
+   fsl,rx-timeslot-mask = <0xfffe>;
+   fsl,tdm-framer-type = "e1";
+   fsl,tdm-id = <0>;
+   fsl,siram-entry-id = <0>;
+   fsl,tdm-interface;
+   };
+};
+
 &duart0 {
status = "okay";
 };
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi 
b/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi
index ec13a6e..38e8e2b 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi
@@ -474,6 +474,72 @@
#interrupt-cells = <2>;
};
 
+   uqe: uqe@240 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   device_type = "qe";
+   compatible = "fsl,qe", "simple-bus";
+   ranges = <0x0 0x0 0x240 0x4>;
+   reg = <0x0 0x240 0x0 0x480>;
+   brg-frequency = <1>;
+   bus-frequency = <2>;
+
+   fsl,qe-num-riscs = <1>;
+   fsl,qe-num-snums = <28>;
+
+   qeic: qeic@80 {
+   compatible = "fsl,qe-ic";
+   reg = <0x80 0x80>;
+   #address-cells = <0>;
+   interrupt-controller;
+   #interrupt-cells = <1>;
+   interrupts = <0 77 0x04 0 77 0x04>;
+   };
+
+   si1: si@700 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   compatible = "fsl,ls1043-qe-si",
+   "fsl,t1040-qe-si";
+   reg = <0x700 0x80>;
+   };
+
+   siram1: siram@1000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "fsl,ls1043-qe-siram",
+   "fsl,t1040-qe-siram";
+   reg = <0x1000 0x800>;
+   };
+
+   ucc@2000 {
+   cell-index = <1>;
+   reg = <0x2000 0x200>;
+   interrupts = <32>;
+   interrupt-parent = <&qeic>;
+   };
+
+   ucc@2200 {
+   cell-index = <3>;
+   reg = <0x2200 0x200>;
+   interrupts = <34>;
+   interrupt-parent = <&qeic>;
+   };
+
+   muram@1 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "fsl,qe-muram", "fsl,cpm-muram";
+   ranges = <0x0 0x1 0x6000>;
+
+   data-only@0 {
+   compatible = "fsl,qe-muram-data",
+   "fsl,cpm-muram-data";
+   reg = <0x0 0x6000>;
+   };
+   };
+   };
+
lpuart0: serial@295 {
compatible = "fsl,ls1021a-lpuart";
reg = <0x0 0x295 0x0 0x1000>;
-- 
2.1.0.27.g96db324



[Patch v11 3/4] irqchip/qeic: remove PPCisms for QEIC

2017-10-31 Thread Zhao Qiang
QEIC was supported only on PowerPC and depended on PPC-specific code.
It is now supported on other platforms as well, so remove the PPCisms.

Signed-off-by: Zhao Qiang 
---
 arch/powerpc/platforms/83xx/km83xx.c  |   1 -
 arch/powerpc/platforms/83xx/misc.c|   1 -
 arch/powerpc/platforms/83xx/mpc832x_mds.c |   1 -
 arch/powerpc/platforms/83xx/mpc832x_rdb.c |   1 -
 arch/powerpc/platforms/83xx/mpc836x_mds.c |   1 -
 arch/powerpc/platforms/83xx/mpc836x_rdk.c |   1 -
 arch/powerpc/platforms/85xx/corenet_generic.c |   1 -
 arch/powerpc/platforms/85xx/mpc85xx_mds.c |   1 -
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c |   1 -
 arch/powerpc/platforms/85xx/twr_p102x.c   |   1 -
 drivers/irqchip/irq-qeic.c| 188 +++---
 include/soc/fsl/qe/qe_ic.h| 132 --
 12 files changed, 80 insertions(+), 250 deletions(-)
 delete mode 100644 include/soc/fsl/qe/qe_ic.h

diff --git a/arch/powerpc/platforms/83xx/km83xx.c 
b/arch/powerpc/platforms/83xx/km83xx.c
index d8642a4afc74..b1cef0ac5507 100644
--- a/arch/powerpc/platforms/83xx/km83xx.c
+++ b/arch/powerpc/platforms/83xx/km83xx.c
@@ -38,7 +38,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mpc83xx.h"
 
diff --git a/arch/powerpc/platforms/83xx/misc.c 
b/arch/powerpc/platforms/83xx/misc.c
index c09a13532c89..07a0e6128ad2 100644
--- a/arch/powerpc/platforms/83xx/misc.c
+++ b/arch/powerpc/platforms/83xx/misc.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/arch/powerpc/platforms/83xx/mpc832x_mds.c 
b/arch/powerpc/platforms/83xx/mpc832x_mds.c
index bb7b25acf26f..a1cadf4a695b 100644
--- a/arch/powerpc/platforms/83xx/mpc832x_mds.c
+++ b/arch/powerpc/platforms/83xx/mpc832x_mds.c
@@ -37,7 +37,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mpc83xx.h"
 
diff --git a/arch/powerpc/platforms/83xx/mpc832x_rdb.c 
b/arch/powerpc/platforms/83xx/mpc832x_rdb.c
index a4539c5accb0..2fb1464a02a7 100644
--- a/arch/powerpc/platforms/83xx/mpc832x_rdb.c
+++ b/arch/powerpc/platforms/83xx/mpc832x_rdb.c
@@ -26,7 +26,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/arch/powerpc/platforms/83xx/mpc836x_mds.c 
b/arch/powerpc/platforms/83xx/mpc836x_mds.c
index 4fc3051c2b2e..9234d635f5ca 100644
--- a/arch/powerpc/platforms/83xx/mpc836x_mds.c
+++ b/arch/powerpc/platforms/83xx/mpc836x_mds.c
@@ -45,7 +45,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mpc83xx.h"
 
diff --git a/arch/powerpc/platforms/83xx/mpc836x_rdk.c 
b/arch/powerpc/platforms/83xx/mpc836x_rdk.c
index 93f024fd9b45..82fa344a9125 100644
--- a/arch/powerpc/platforms/83xx/mpc836x_rdk.c
+++ b/arch/powerpc/platforms/83xx/mpc836x_rdk.c
@@ -21,7 +21,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c 
b/arch/powerpc/platforms/85xx/corenet_generic.c
index 1b385acbf4dd..9ca27b1403ce 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -27,7 +27,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c 
b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
index 06f34a99152e..8102e5f7cb98 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
@@ -49,7 +49,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include "smp.h"
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c 
b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
index 000d385933af..f806b6bbf3a3 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
@@ -27,7 +27,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
diff --git a/arch/powerpc/platforms/85xx/twr_p102x.c 
b/arch/powerpc/platforms/85xx/twr_p102x.c
index 6be9b337035a..4f620f2f9f64 100644
--- a/arch/powerpc/platforms/85xx/twr_p102x.c
+++ b/arch/powerpc/platforms/85xx/twr_p102x.c
@@ -23,7 +23,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
diff --git a/drivers/irqchip/irq-qeic.c b/drivers/irqchip/irq-qeic.c
index a2d808410220..26bfcbdc1d35 100644
--- a/drivers/irqchip/irq-qeic.c
+++ b/drivers/irqchip/irq-qeic.c
@@ -18,8 +18,11 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -27,9 +30,8 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
-#include 
 
 #define NR_QE_IC_INTS  64
 
@@ -87,6 +89,43 @@
 #define SIGNAL_HIGH2
 #define SIGNAL_LOW 0
 
+#define NUM_OF_QE_IC_GROUPS6
+
+/* Flags when we init the QE IC */
+#define QE_IC_SPREADMODE_GRP_W 0x0001
+#define QE_IC_SPREADMODE_GRP_X 0x0002
+#define QE_IC_SPREADMODE_GRP_Y 0x0004
+#define QE_IC_SPREADMODE_GRP_Z 0x0008
+#d

[Patch v11 2/4] irqchip/qeic: merge qeic_of_init into qe_ic_init

2017-10-31 Thread Zhao Qiang
qeic_of_init only gets the qeic device_node from the dtb and passes it
to qe_ic_init.
So merge qeic_of_init into qe_ic_init and get the qeic node in
qe_ic_init itself.

Signed-off-by: Zhao Qiang 
---
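Note for readers: the error handling added to qe_ic_init in this patch
follows the usual goto-unwind pattern, where each failure jumps to a label
that releases only what has been acquired so far, and the node reference is
dropped on every path. Below is a minimal userspace sketch of that pattern.
The names fake_node, fake_alloc, fake_free, fake_init and the fail_at knob
are hypothetical stand-ins invented for illustration; this is not the
driver's code.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted device node. */
struct fake_node {
	int refcount;
};

/* Counts live allocations so a caller can check that nothing leaked. */
static int live_allocs;

static void *fake_alloc(int fail)
{
	void *p = fail ? NULL : malloc(16);

	if (p)
		live_allocs++;
	return p;
}

static void fake_free(void *p)
{
	live_allocs--;
	free(p);
}

/*
 * fail_at selects which step fails: 0 = none, 1 = controller
 * allocation, 2 = domain allocation, 3 = IRQ mapping.
 * The node reference is dropped on every path past the NULL check.
 */
static int fake_init(struct fake_node *node, int fail_at)
{
	void *ctrl, *domain;
	int ret;

	if (!node)
		return -ENODEV;

	ctrl = fake_alloc(fail_at == 1);
	if (!ctrl) {
		ret = -ENOMEM;
		goto err_put_node;
	}

	domain = fake_alloc(fail_at == 2);
	if (!domain) {
		ret = -ENOMEM;
		goto err_free_ctrl;
	}

	if (fail_at == 3) {
		ret = -ENOMEM;
		goto err_free_domain;
	}

	/* Success: the controller and domain stay alive, the node ref goes. */
	node->refcount--;
	return 0;

err_free_domain:
	fake_free(domain);
err_free_ctrl:
	fake_free(ctrl);
err_put_node:
	node->refcount--;
	return ret;
}
```

The labels are listed in reverse order of acquisition, so a later failure
falls through every earlier cleanup step.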
 drivers/irqchip/irq-qeic.c | 90 --
 include/soc/fsl/qe/qe_ic.h |  7 
 2 files changed, 39 insertions(+), 58 deletions(-)

diff --git a/drivers/irqchip/irq-qeic.c b/drivers/irqchip/irq-qeic.c
index 8287c22d954a..a2d808410220 100644
--- a/drivers/irqchip/irq-qeic.c
+++ b/drivers/irqchip/irq-qeic.c
@@ -407,27 +407,33 @@ unsigned int qe_ic_get_high_irq(struct qe_ic *qe_ic)
return irq_linear_revmap(qe_ic->irqhost, irq);
 }
 
-void __init qe_ic_init(struct device_node *node, unsigned int flags,
-  void (*low_handler)(struct irq_desc *desc),
-  void (*high_handler)(struct irq_desc *desc))
+static int __init qe_ic_init(struct device_node *node, unsigned int flags)
 {
struct qe_ic *qe_ic;
struct resource res;
-   u32 temp = 0, ret, high_active = 0;
+   u32 temp = 0, high_active = 0;
+   int ret = 0;
+
+   if (!node)
+   return -ENODEV;
 
ret = of_address_to_resource(node, 0, &res);
-   if (ret)
-   return;
+   if (ret) {
+   ret = -ENODEV;
+   goto err_put_node;
+   }
 
qe_ic = kzalloc(sizeof(*qe_ic), GFP_KERNEL);
-   if (qe_ic == NULL)
-   return;
+   if (qe_ic == NULL) {
+   ret = -ENOMEM;
+   goto err_put_node;
+   }
 
qe_ic->irqhost = irq_domain_add_linear(node, NR_QE_IC_INTS,
   &qe_ic_host_ops, qe_ic);
if (qe_ic->irqhost == NULL) {
-   kfree(qe_ic);
-   return;
+   ret = -ENOMEM;
+   goto err_free_qe_ic;
}
 
qe_ic->regs = ioremap(res.start, resource_size(&res));
@@ -438,9 +444,9 @@ void __init qe_ic_init(struct device_node *node, unsigned 
int flags,
qe_ic->virq_low = irq_of_parse_and_map(node, 1);
 
if (qe_ic->virq_low == NO_IRQ) {
-   printk(KERN_ERR "Failed to map QE_IC low IRQ\n");
-   kfree(qe_ic);
-   return;
+   pr_err("Failed to map QE_IC low IRQ\n");
+   ret = -ENOMEM;
+   goto err_domain_remove;
}
 
/* default priority scheme is grouped. If spread mode is*/
@@ -467,13 +473,24 @@ void __init qe_ic_init(struct device_node *node, unsigned 
int flags,
qe_ic_write(qe_ic->regs, QEIC_CICR, temp);
 
irq_set_handler_data(qe_ic->virq_low, qe_ic);
-   irq_set_chained_handler(qe_ic->virq_low, low_handler);
+   irq_set_chained_handler(qe_ic->virq_low, qe_ic_cascade_low_mpic);
 
if (qe_ic->virq_high != NO_IRQ &&
qe_ic->virq_high != qe_ic->virq_low) {
irq_set_handler_data(qe_ic->virq_high, qe_ic);
-   irq_set_chained_handler(qe_ic->virq_high, high_handler);
+   irq_set_chained_handler(qe_ic->virq_high,
+   qe_ic_cascade_high_mpic);
}
+   of_node_put(node);
+   return 0;
+
+err_domain_remove:
+   irq_domain_remove(qe_ic->irqhost);
+err_free_qe_ic:
+   kfree(qe_ic);
+err_put_node:
+   of_node_put(node);
+   return ret;
 }
 
 void qe_ic_set_highest_priority(unsigned int virq, int high)
@@ -570,45 +587,16 @@ int qe_ic_set_high_priority(unsigned int virq, unsigned 
int priority, int high)
return 0;
 }
 
-static struct bus_type qe_ic_subsys = {
-   .name = "qe_ic",
-   .dev_name = "qe_ic",
-};
-
-static struct device device_qe_ic = {
-   .id = 0,
-   .bus = &qe_ic_subsys,
-};
-
-static int __init init_qe_ic_sysfs(void)
+static int __init init_qe_ic(struct device_node *node,
+struct device_node *parent)
 {
-   int rc;
-
-   printk(KERN_DEBUG "Registering qe_ic with sysfs...\n");
+   int ret;
 
-   rc = subsys_system_register(&qe_ic_subsys, NULL);
-   if (rc) {
-   printk(KERN_ERR "Failed registering qe_ic sys class\n");
-   return -ENODEV;
-   }
-   rc = device_register(&device_qe_ic);
-   if (rc) {
-   printk(KERN_ERR "Failed registering qe_ic sys device\n");
-   return -ENODEV;
-   }
-   return 0;
-}
+   ret = qe_ic_init(node, 0);
+   if (ret)
+   return ret;
 
-static int __init qeic_of_init(struct device_node *node,
-  struct device_node *parent)
-{
-   if (!node)
-   return -ENODEV;
-   qe_ic_init(node, 0, qe_ic_cascade_low_mpic,
-  qe_ic_cascade_high_mpic);
-   of_node_put(node);
return 0;
 }
 
-IRQCHIP_DECLARE(qeic, "fsl,qe-ic", qeic_of_init);
-subsys_initcall(init_qe_ic_sysfs);
+IRQCHIP_DECLARE(qeic, "fsl,qe-ic", init_qe_ic);
diff --gi

[Patch v11 4/4] QE: remove PPCisms for QE

2017-10-31 Thread Zhao Qiang
QE was supported only on PowerPC and depended on PPC-specific code.
It is now supported on other platforms as well, so remove the PPCisms.

Signed-off-by: Zhao Qiang 
---
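Note for readers: the qe.c hunks in this patch swap the PowerPC-only
out_be32()/in_be32() accessors for iowrite32be()/ioread32be() (mind the
reversed argument order: value first, then address) and replace
spin_event_timeout() with an explicit loop of at most 100 reads. A rough
userspace model of both ideas follows. The names be32_store, be32_load,
poll_flag_clear, clear_on_third and the FLAG value are hypothetical, chosen
only for this sketch, and plain memory stands in for the device register.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG 0x00010000u /* hypothetical busy flag, not the real QE_CR_FLG */

/* Store a 32-bit value in big-endian byte order: value first, then address. */
static void be32_store(uint32_t val, volatile uint8_t *addr)
{
	addr[0] = (uint8_t)(val >> 24);
	addr[1] = (uint8_t)(val >> 16);
	addr[2] = (uint8_t)(val >> 8);
	addr[3] = (uint8_t)val;
}

/* Load a 32-bit big-endian value regardless of host byte order. */
static uint32_t be32_load(const volatile uint8_t *addr)
{
	return ((uint32_t)addr[0] << 24) | ((uint32_t)addr[1] << 16) |
	       ((uint32_t)addr[2] << 8) | (uint32_t)addr[3];
}

/*
 * Read the register up to max_tries times, returning 0 as soon as FLAG
 * is clear and -EIO if it never clears. step() models the hardware
 * making progress between reads and may be NULL.
 */
static int poll_flag_clear(volatile uint8_t *reg, int max_tries,
			   void (*step)(volatile uint8_t *reg))
{
	int i;

	for (i = 0; i < max_tries; i++) {
		if ((be32_load(reg) & FLAG) == 0)
			return 0;
		if (step)
			step(reg);
	}
	return -EIO;
}

/* Example "hardware": clears the flag on its third invocation. */
static int calls;

static void clear_on_third(volatile uint8_t *reg)
{
	if (++calls == 3)
		be32_store(be32_load(reg) & ~FLAG, reg);
}
```

Returning a negative errno from the poll, rather than the boolean-like
result of spin_event_timeout(), matches the "return 0 on success while -EIO
on error" comment the patch adds above qe_issue_cmd.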
 drivers/net/ethernet/freescale/Kconfig | 11 ++---
 drivers/soc/fsl/qe/Kconfig |  2 +-
 drivers/soc/fsl/qe/qe.c| 82 +-
 drivers/soc/fsl/qe/qe_io.c | 42 -
 drivers/soc/fsl/qe/qe_tdm.c|  8 ++--
 drivers/soc/fsl/qe/ucc.c   | 10 ++---
 drivers/soc/fsl/qe/ucc_fast.c  | 74 +++---
 drivers/tty/serial/Kconfig |  2 +-
 drivers/tty/serial/ucc_uart.c  |  1 +
 drivers/usb/gadget/udc/Kconfig |  2 +-
 drivers/usb/host/Kconfig   |  2 +-
 include/soc/fsl/qe/qe.h|  1 -
 12 files changed, 126 insertions(+), 111 deletions(-)

diff --git a/drivers/net/ethernet/freescale/Kconfig 
b/drivers/net/ethernet/freescale/Kconfig
index 6e490fd2345d..015bdb829d18 100644
--- a/drivers/net/ethernet/freescale/Kconfig
+++ b/drivers/net/ethernet/freescale/Kconfig
@@ -5,10 +5,11 @@
 config NET_VENDOR_FREESCALE
bool "Freescale devices"
default y
-   depends on FSL_SOC || QUICC_ENGINE || CPM1 || CPM2 || PPC_MPC512x || \
-  M523x || M527x || M5272 || M528x || M520x || M532x || \
-  ARCH_MXC || ARCH_MXS || (PPC_MPC52xx && PPC_BESTCOMM) || \
-  ARCH_LAYERSCAPE || COMPILE_TEST
+   depends on FSL_SOC || (QUICC_ENGINE && PPC32) || CPM1 || CPM2 || \
+  PPC_MPC512x || M523x || M527x || M5272 || M528x || M520x || \
+  M532x || ARCH_MXC || ARCH_MXS || \
+  (PPC_MPC52xx && PPC_BESTCOMM) || ARCH_LAYERSCAPE || \
+  COMPILE_TEST
---help---
  If you have a network (Ethernet) card belonging to this class, say Y.
 
@@ -73,7 +74,7 @@ config FSL_XGMAC_MDIO
 
 config UCC_GETH
tristate "Freescale QE Gigabit Ethernet"
-   depends on QUICC_ENGINE
+   depends on QUICC_ENGINE && FSL_SOC && PPC32
select FSL_PQ_MDIO
select PHYLIB
---help---
diff --git a/drivers/soc/fsl/qe/Kconfig b/drivers/soc/fsl/qe/Kconfig
index 73a2e08b47ef..b26b64319d67 100644
--- a/drivers/soc/fsl/qe/Kconfig
+++ b/drivers/soc/fsl/qe/Kconfig
@@ -4,7 +4,7 @@
 
 config QUICC_ENGINE
bool "Freescale QUICC Engine (QE) Support"
-   depends on FSL_SOC && PPC32
+   depends on OF && HAS_IOMEM
select GENERIC_ALLOCATOR
select CRC32
help
diff --git a/drivers/soc/fsl/qe/qe.c b/drivers/soc/fsl/qe/qe.c
index 2ef6fc6487c1..1d695870ea9e 100644
--- a/drivers/soc/fsl/qe/qe.c
+++ b/drivers/soc/fsl/qe/qe.c
@@ -33,8 +33,6 @@
 #include 
 #include 
 #include 
-#include 
-#include 
 
 static void qe_snums_init(void);
 static int qe_sdma_init(void);
@@ -107,15 +105,27 @@ void qe_reset(void)
panic("sdma init failed!");
 }
 
+/* issue commands to QE, return 0 on success while -EIO on error
+ *
+ * @cmd: the command code, should be QE_INIT_TX_RX, QE_STOP_TX and so on
+ * @device: which sub-block will run the command, QE_CR_SUBBLOCK_UCCFAST1 - 8
+ * , QE_CR_SUBBLOCK_UCCSLOW1 - 8, QE_CR_SUBBLOCK_MCC1 - 3,
+ * QE_CR_SUBBLOCK_IDMA1 - 4 and such on.
+ * @mcn_protocol: specifies mode for the command for non-MCC, should be
+ * QE_CR_PROTOCOL_HDLC_TRANSPARENT, QE_CR_PROTOCOL_QMC, QE_CR_PROTOCOL_UART
+ * and such on.
+ * @cmd_input: command related data.
+ */
 int qe_issue_cmd(u32 cmd, u32 device, u8 mcn_protocol, u32 cmd_input)
 {
unsigned long flags;
u8 mcn_shift = 0, dev_shift = 0;
-   u32 ret;
+   int ret;
+   int i;
 
spin_lock_irqsave(&qe_lock, flags);
if (cmd == QE_RESET) {
-   out_be32(&qe_immr->cp.cecr, (u32) (cmd | QE_CR_FLG));
+   iowrite32be((cmd | QE_CR_FLG), &qe_immr->cp.cecr);
} else {
if (cmd == QE_ASSIGN_PAGE) {
/* Here device is the SNUM, not sub-block */
@@ -132,20 +142,26 @@ int qe_issue_cmd(u32 cmd, u32 device, u8 mcn_protocol, 
u32 cmd_input)
mcn_shift = QE_CR_MCN_NORMAL_SHIFT;
}
 
-   out_be32(&qe_immr->cp.cecdr, cmd_input);
-   out_be32(&qe_immr->cp.cecr,
-(cmd | QE_CR_FLG | ((u32) device << dev_shift) | (u32)
- mcn_protocol << mcn_shift));
+   iowrite32be(cmd_input, &qe_immr->cp.cecdr);
+   iowrite32be((cmd | QE_CR_FLG | ((u32)device << dev_shift) |
+   (u32)mcn_protocol << mcn_shift), &qe_immr->cp.cecr);
}
 
/* wait for the QE_CR_FLG to clear */
-   ret = spin_event_timeout((in_be32(&qe_immr->cp.cecr) & QE_CR_FLG) == 0,
-  100, 0);
+   ret = -EIO;
+   for (i = 0; i < 100; i++) {
+   if ((ioread32be(&qe_immr->cp.cecr) & QE_CR_FLG) == 0) {
+   ret = 0;
+  

Re: [PATCH 0/2][v2] Add the ability to do BPF directed error injection

2017-10-31 Thread David Miller
From: Josef Bacik 
Date: Tue, 31 Oct 2017 11:45:55 -0400

> v1->v2:
> - moved things around to make sure that bpf_override_return could really
>   only be used for an ftrace kprobe.
> - killed the special return values from trace_call_bpf.
> - renamed pc_modified to bpf_kprobe_state so bpf_override_return could tell if
>   it was being called from an ftrace kprobe context.
> - reworked the logic in kprobe_perf_func to take advantage of
>   bpf_kprobe_state.
> - updated the test as per Alexei's review.
> 
> A lot of our error paths are not well tested because we have no good way of
> injecting errors generically.  Some subsystems (block, memory) have ways to
> inject errors, but they are random so it's hard to get reproducible results.
> 
> With BPF we can add determinism to our error injection.  We can use kprobes
> and other things to verify we are injecting errors at the exact case we are
> trying to test.  This patch gives us the tool to actually do the error
> injection part.  It is very simple, we just set the return value of the
> pt_regs we're given to whatever we provide, and then override the PC with a
> dummy function that simply returns.
> 
> Right now this only works on x86, but it would be simple enough to expand to
> other architectures.  Thanks,

This appears to target the tracing tree more than the networking tree.

Let me know if that's not the case and I should be the one integrating
these changes.

Thanks.


[Patch v11 1/4] irqchip/qeic: merge qeic init code from platforms to a common function

2017-10-31 Thread Zhao Qiang
The qe_ic init code is duplicated across a variety of platforms;
merge it into a common function and put it in irqchip/irq-qeic.c.

For non-p1021_mds mpc85xx_mds boards, use "qe_ic_init(np, 0,
qe_ic_cascade_low_mpic, qe_ic_cascade_high_mpic);" instead of
"qe_ic_init(np, 0, qe_ic_cascade_muxed_mpic, NULL);".

qe_ic_cascade_muxed_mpic was used for boards that have the same interrupt
number for the low and high interrupts; qe_ic_init already checks
whether "low interrupt == high interrupt".

Signed-off-by: Zhao Qiang 
---
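Note for readers: the argument above rests on qe_ic_init installing the
high cascade handler only when the high IRQ is mapped and differs from the
low one, so boards that share one interrupt line end up with a single
handler. A small sketch of that decision is below. The names cascade_plan,
plan_cascade and NO_IRQ_SKETCH are hypothetical, and treating 0 as "no
interrupt mapped" is an assumption made for this example.

```c
#include <stdbool.h>

#define NO_IRQ_SKETCH 0u /* assumption: 0 means "no interrupt mapped" */

/* Which chained handlers would be installed for a given IRQ pair. */
struct cascade_plan {
	bool install_low;
	bool install_high;
};

static struct cascade_plan plan_cascade(unsigned int virq_low,
					unsigned int virq_high)
{
	struct cascade_plan plan = { false, false };

	/* Without a low IRQ the init fails and nothing is installed. */
	if (virq_low == NO_IRQ_SKETCH)
		return plan;

	plan.install_low = true;

	/* The high handler is skipped when it would share the low line. */
	plan.install_high = virq_high != NO_IRQ_SKETCH &&
			    virq_high != virq_low;
	return plan;
}
```

With a shared line, plan_cascade(5, 5) asks for the low handler only, which
is the case the muxed handler used to cover.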
 arch/powerpc/platforms/83xx/misc.c| 15 ---
 arch/powerpc/platforms/85xx/corenet_generic.c |  9 -
 arch/powerpc/platforms/85xx/mpc85xx_mds.c | 14 --
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c | 16 
 arch/powerpc/platforms/85xx/twr_p102x.c   | 14 --
 drivers/irqchip/irq-qeic.c| 13 +
 6 files changed, 13 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/platforms/83xx/misc.c 
b/arch/powerpc/platforms/83xx/misc.c
index d75c9816a5c9..c09a13532c89 100644
--- a/arch/powerpc/platforms/83xx/misc.c
+++ b/arch/powerpc/platforms/83xx/misc.c
@@ -93,24 +93,9 @@ void __init mpc83xx_ipic_init_IRQ(void)
 }
 
 #ifdef CONFIG_QUICC_ENGINE
-void __init mpc83xx_qe_init_IRQ(void)
-{
-   struct device_node *np;
-
-   np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-   if (!np) {
-   np = of_find_node_by_type(NULL, "qeic");
-   if (!np)
-   return;
-   }
-   qe_ic_init(np, 0, qe_ic_cascade_low_ipic, qe_ic_cascade_high_ipic);
-   of_node_put(np);
-}
-
 void __init mpc83xx_ipic_and_qe_init_IRQ(void)
 {
mpc83xx_ipic_init_IRQ();
-   mpc83xx_qe_init_IRQ();
 }
 #endif /* CONFIG_QUICC_ENGINE */
 
diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c 
b/arch/powerpc/platforms/85xx/corenet_generic.c
index ac191a7a1337..1b385acbf4dd 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -41,8 +41,6 @@ void __init corenet_gen_pic_init(void)
unsigned int flags = MPIC_BIG_ENDIAN | MPIC_SINGLE_DEST_CPU |
MPIC_NO_RESET;
 
-   struct device_node *np;
-
if (ppc_md.get_irq == mpic_get_coreint_irq)
flags |= MPIC_ENABLE_COREINT;
 
@@ -50,13 +48,6 @@ void __init corenet_gen_pic_init(void)
BUG_ON(mpic == NULL);
 
mpic_init(mpic);
-
-   np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-   if (np) {
-   qe_ic_init(np, 0, qe_ic_cascade_low_mpic,
-   qe_ic_cascade_high_mpic);
-   of_node_put(np);
-   }
 }
 
 /*
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c 
b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
index d7e440e6dba3..06f34a99152e 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
@@ -283,20 +283,6 @@ static void __init mpc85xx_mds_qeic_init(void)
of_node_put(np);
return;
}
-
-   np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-   if (!np) {
-   np = of_find_node_by_type(NULL, "qeic");
-   if (!np)
-   return;
-   }
-
-   if (machine_is(p1021_mds))
-   qe_ic_init(np, 0, qe_ic_cascade_low_mpic,
-   qe_ic_cascade_high_mpic);
-   else
-   qe_ic_init(np, 0, qe_ic_cascade_muxed_mpic, NULL);
-   of_node_put(np);
 }
 #else
 static void __init mpc85xx_mds_qe_init(void) { }
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c 
b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
index 10069503e39f..000d385933af 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
@@ -48,10 +48,6 @@ void __init mpc85xx_rdb_pic_init(void)
 {
struct mpic *mpic;
 
-#ifdef CONFIG_QUICC_ENGINE
-   struct device_node *np;
-#endif
-
if (of_machine_is_compatible("fsl,MPC85XXRDB-CAMP")) {
mpic = mpic_alloc(NULL, 0, MPIC_NO_RESET |
MPIC_BIG_ENDIAN |
@@ -66,18 +62,6 @@ void __init mpc85xx_rdb_pic_init(void)
 
BUG_ON(mpic == NULL);
mpic_init(mpic);
-
-#ifdef CONFIG_QUICC_ENGINE
-   np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-   if (np) {
-   qe_ic_init(np, 0, qe_ic_cascade_low_mpic,
-   qe_ic_cascade_high_mpic);
-   of_node_put(np);
-
-   } else
-   pr_err("%s: Could not find qe-ic node\n", __func__);
-#endif
-
 }
 
 /*
diff --git a/arch/powerpc/platforms/85xx/twr_p102x.c 
b/arch/powerpc/platforms/85xx/twr_p102x.c
index 360f6253e9ff..6be9b337035a 100644
--- a/arch/powerpc/platforms/85xx/twr_p102x.c
+++ b/arch/powerpc/platforms/85xx/twr_p102x.c
@@ -35,26 +35,12 @@ static void __init twr_p1025_pic_init(void)
 {
struct mpic *mpic;
 
-#ifdef CONFIG_Q

[Patch v11 0/4] This patchset is to remove PPCisms for QEIC

2017-10-31 Thread Zhao Qiang
QEIC is the interrupt controller for QE. It used to live under
drivers/soc/fsl/qe and has now moved to drivers/irqchip.
QEIC is also supported on more than just powerpc boards, so remove the
PPCisms.

changelog:
Changes for v8:
- use IRQCHIP_DECLARE() instead of subsys_initcall in qeic driver
- remove include/soc/fsl/qe/qe_ic.h
Changes for v9:
- rebase 
- fix the compile issue when applying only the second patch; in fact, there
  was no compile issue when applying all the patches of this patchset
Changes for v10:
- simplify the code, remove duplicated code
Changes for v11:
- rebase
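
Note for readers: the v8 switch to IRQCHIP_DECLARE() means the driver no
longer runs its own initcall. It registers a (compatible string, init
function) pair instead, and the irqchip core calls the init function for
matching device-tree nodes. A toy userspace model of that table-driven
matching is sketched below. The names chip_entry, chip_table,
fake_qeic_init and init_matching are hypothetical, and the real core does
considerably more (for example, ordering controllers by parent).

```c
#include <stddef.h>
#include <string.h>

/* One table entry: a compatible string and the init function to call. */
struct chip_entry {
	const char *compatible;
	int (*init)(const char *node_name);
};

/* Counts how often the fake init ran, so callers can observe it. */
static int qeic_inits;

static int fake_qeic_init(const char *node_name)
{
	(void)node_name;
	qeic_inits++;
	return 0;
}

/* Stand-in for the table the declaration macro would populate. */
static const struct chip_entry chip_table[] = {
	{ "fsl,qe-ic", fake_qeic_init },
};

/* Call the init function whose compatible string matches, if any. */
static int init_matching(const char *compatible, const char *node_name)
{
	size_t i;

	for (i = 0; i < sizeof(chip_table) / sizeof(chip_table[0]); i++) {
		if (strcmp(chip_table[i].compatible, compatible) == 0)
			return chip_table[i].init(node_name);
	}
	return -1; /* no driver claims this node */
}
```

A node with compatible "fsl,qe-ic" reaches fake_qeic_init; any other
string is left alone.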

Zhao Qiang (4):
  irqchip/qeic: move qeic driver from drivers/soc/fsl/qe
Changes for v2:
- modify the subject and commit msg
Changes for v3:
- merge .h file to .c, rename it with irq-qeic.c
Changes for v4:
- modify comments
Changes for v5:
- disable rename detection
Changes for v6:
- rebase
Changes for v7:
- na

  irqchip/qeic: merge qeic init code from platforms to a common function
Changes for v2:
- modify subject and commit msg
- add check for qeic by type
Changes for v3:
- na
Changes for v4:
- na
Changes for v5:
- na
Changes for v6:
- rebase
Changes for v7:
- na
Changes for v8:
- use IRQCHIP_DECLARE() instead of subsys_initcall

  irqchip/qeic: merge qeic_of_init into qe_ic_init
Changes for v2:
- modify subject and commit msg
- return 0 and add put node when return in qe_ic_init
Changes for v3:
- na
Changes for v4:
- na
Changes for v5:
- na
Changes for v6:
- rebase
Changes for v7:
- na

  irqchip/qeic: remove PPCisms for QEIC
Changes for v6:
- new added
Changes for v7:
- fix warning
Changes for v8:
- remove include/soc/fsl/qe/qe_ic.h


Zhao Qiang (4):
  irqchip/qeic: merge qeic init code from platforms to a common function
  irqchip/qeic: merge qeic_of_init into qe_ic_init
  irqchip/qeic: remove PPCisms for QEIC
  QE: remove PPCisms for QE

 arch/powerpc/platforms/83xx/km83xx.c  |   1 -
 arch/powerpc/platforms/83xx/misc.c|  16 --
 arch/powerpc/platforms/83xx/mpc832x_mds.c |   1 -
 arch/powerpc/platforms/83xx/mpc832x_rdb.c |   1 -
 arch/powerpc/platforms/83xx/mpc836x_mds.c |   1 -
 arch/powerpc/platforms/83xx/mpc836x_rdk.c |   1 -
 arch/powerpc/platforms/85xx/corenet_generic.c |  10 -
 arch/powerpc/platforms/85xx/mpc85xx_mds.c |  15 --
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c |  17 --
 arch/powerpc/platforms/85xx/twr_p102x.c   |  15 --
 drivers/irqchip/irq-qeic.c| 263 --
 drivers/net/ethernet/freescale/Kconfig|  11 +-
 drivers/soc/fsl/qe/Kconfig|   2 +-
 drivers/soc/fsl/qe/qe.c   |  82 
 drivers/soc/fsl/qe/qe_io.c|  42 ++--
 drivers/soc/fsl/qe/qe_tdm.c   |   8 +-
 drivers/soc/fsl/qe/ucc.c  |  10 +-
 drivers/soc/fsl/qe/ucc_fast.c |  74 
 drivers/tty/serial/Kconfig|   2 +-
 drivers/tty/serial/ucc_uart.c |   1 +
 drivers/usb/gadget/udc/Kconfig|   2 +-
 drivers/usb/host/Kconfig  |   2 +-
 include/soc/fsl/qe/qe.h   |   1 -
 include/soc/fsl/qe/qe_ic.h| 139 --
 24 files changed, 244 insertions(+), 473 deletions(-)
 delete mode 100644 include/soc/fsl/qe/qe_ic.h

-- 
2.14.1



Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules

2017-10-31 Thread zhouchengming

On 2017/11/1 5:59, Steven Rostedt wrote:

On Mon, 30 Oct 2017 17:03:23 +0900
Masami Hiramatsu  wrote:


  static LIST_HEAD(smp_alt_modules);
-static DEFINE_MUTEX(smp_alt);
-static bool uniproc_patched = false;   /* protected by smp_alt */
+static bool uniproc_patched = false;   /* protected by text_mutex */

We should also add a comment somewhere by the text_mutex that it is
protecting this on x86.


Good, I will send a patch-v2 adding this comment.

Thanks!


-- Steve




[PATCH 2/8] ls1043ardb: add ds26522 node to dts

2017-10-31 Thread Zhao Qiang
add ds26522 node to fsl-ls1043a-rdb.dts

Signed-off-by: Zhao Qiang 
---
 arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dts | 16 
 1 file changed, 16 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dts 
b/arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dts
index 8f23f39..11213ce 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dts
@@ -130,6 +130,22 @@
reg = <0>;
spi-max-frequency = <100>; /* input clock */
};
+
+   slic@2 {
+   compatible = "maxim,ds26522";
+   reg = <2>;
+   spi-max-frequency = <200>;
+   fsl,spi-cs-sck-delay = <100>;
+   fsl,spi-sck-cs-delay = <50>;
+   };
+
+   slic@3 {
+   compatible = "maxim,ds26522";
+   reg = <3>;
+   spi-max-frequency = <200>;
+   fsl,spi-cs-sck-delay = <100>;
+   fsl,spi-sck-cs-delay = <50>;
+   };
 };
 
 &uqe {
-- 
2.1.0.27.g96db324



[PATCH 3/8] QE: remove PPCisms for QE

2017-10-31 Thread Zhao Qiang
QE was supported only on PowerPC and depended on PPC-specific code.
It is now supported on other platforms as well, so remove the PPCisms.

Signed-off-by: Zhao Qiang 
---
 drivers/net/ethernet/freescale/Kconfig | 11 ++---
 drivers/soc/fsl/qe/Kconfig |  2 +-
 drivers/soc/fsl/qe/qe.c| 80 --
 drivers/soc/fsl/qe/qe_io.c | 42 --
 drivers/soc/fsl/qe/qe_tdm.c|  8 ++--
 drivers/soc/fsl/qe/ucc.c   | 10 ++---
 drivers/soc/fsl/qe/ucc_fast.c  | 74 ---
 drivers/tty/serial/Kconfig |  2 +-
 drivers/tty/serial/ucc_uart.c  |  1 +
 drivers/usb/gadget/udc/Kconfig |  2 +-
 drivers/usb/host/Kconfig   |  2 +-
 include/soc/fsl/qe/qe.h|  1 -
 12 files changed, 124 insertions(+), 111 deletions(-)

diff --git a/drivers/net/ethernet/freescale/Kconfig 
b/drivers/net/ethernet/freescale/Kconfig
index 6e490fd2..015bdb8 100644
--- a/drivers/net/ethernet/freescale/Kconfig
+++ b/drivers/net/ethernet/freescale/Kconfig
@@ -5,10 +5,11 @@
 config NET_VENDOR_FREESCALE
bool "Freescale devices"
default y
-   depends on FSL_SOC || QUICC_ENGINE || CPM1 || CPM2 || PPC_MPC512x || \
-  M523x || M527x || M5272 || M528x || M520x || M532x || \
-  ARCH_MXC || ARCH_MXS || (PPC_MPC52xx && PPC_BESTCOMM) || \
-  ARCH_LAYERSCAPE || COMPILE_TEST
+   depends on FSL_SOC || (QUICC_ENGINE && PPC32) || CPM1 || CPM2 || \
+  PPC_MPC512x || M523x || M527x || M5272 || M528x || M520x || \
+  M532x || ARCH_MXC || ARCH_MXS || \
+  (PPC_MPC52xx && PPC_BESTCOMM) || ARCH_LAYERSCAPE || \
+  COMPILE_TEST
---help---
  If you have a network (Ethernet) card belonging to this class, say Y.
 
@@ -73,7 +74,7 @@ config FSL_XGMAC_MDIO
 
 config UCC_GETH
tristate "Freescale QE Gigabit Ethernet"
-   depends on QUICC_ENGINE
+   depends on QUICC_ENGINE && FSL_SOC && PPC32
select FSL_PQ_MDIO
select PHYLIB
---help---
diff --git a/drivers/soc/fsl/qe/Kconfig b/drivers/soc/fsl/qe/Kconfig
index 73a2e08..b26b643 100644
--- a/drivers/soc/fsl/qe/Kconfig
+++ b/drivers/soc/fsl/qe/Kconfig
@@ -4,7 +4,7 @@
 
 config QUICC_ENGINE
bool "Freescale QUICC Engine (QE) Support"
-   depends on FSL_SOC && PPC32
+   depends on OF && HAS_IOMEM
select GENERIC_ALLOCATOR
select CRC32
help
diff --git a/drivers/soc/fsl/qe/qe.c b/drivers/soc/fsl/qe/qe.c
index ade168f..52aaf41 100644
--- a/drivers/soc/fsl/qe/qe.c
+++ b/drivers/soc/fsl/qe/qe.c
@@ -33,8 +33,6 @@
 #include 
 #include 
 #include 
-#include 
-#include 
 
 static void qe_snums_init(void);
 static int qe_sdma_init(void);
@@ -109,15 +107,27 @@ void qe_reset(void)
panic("sdma init failed!");
 }
 
+/* issue commands to QE, return 0 on success while -EIO on error
+ *
+ * @cmd: the command code, should be QE_INIT_TX_RX, QE_STOP_TX and so on
+ * @device: which sub-block will run the command, QE_CR_SUBBLOCK_UCCFAST1 - 8
+ * , QE_CR_SUBBLOCK_UCCSLOW1 - 8, QE_CR_SUBBLOCK_MCC1 - 3,
+ * QE_CR_SUBBLOCK_IDMA1 - 4 and such on.
+ * @mcn_protocol: specifies mode for the command for non-MCC, should be
+ * QE_CR_PROTOCOL_HDLC_TRANSPARENT, QE_CR_PROTOCOL_QMC, QE_CR_PROTOCOL_UART
+ * and such on.
+ * @cmd_input: command related data.
+ */
 int qe_issue_cmd(u32 cmd, u32 device, u8 mcn_protocol, u32 cmd_input)
 {
unsigned long flags;
u8 mcn_shift = 0, dev_shift = 0;
-   u32 ret;
+   int ret;
+   int i;
 
spin_lock_irqsave(&qe_lock, flags);
if (cmd == QE_RESET) {
-   out_be32(&qe_immr->cp.cecr, (u32) (cmd | QE_CR_FLG));
+   iowrite32be((cmd | QE_CR_FLG), &qe_immr->cp.cecr);
} else {
if (cmd == QE_ASSIGN_PAGE) {
/* Here device is the SNUM, not sub-block */
@@ -134,20 +144,26 @@ int qe_issue_cmd(u32 cmd, u32 device, u8 mcn_protocol, 
u32 cmd_input)
mcn_shift = QE_CR_MCN_NORMAL_SHIFT;
}
 
-   out_be32(&qe_immr->cp.cecdr, cmd_input);
-   out_be32(&qe_immr->cp.cecr,
-(cmd | QE_CR_FLG | ((u32) device << dev_shift) | (u32)
- mcn_protocol << mcn_shift));
+   iowrite32be(cmd_input, &qe_immr->cp.cecdr);
+   iowrite32be((cmd | QE_CR_FLG | ((u32)device << dev_shift) |
+   (u32)mcn_protocol << mcn_shift), &qe_immr->cp.cecr);
}
 
/* wait for the QE_CR_FLG to clear */
-   ret = spin_event_timeout((in_be32(&qe_immr->cp.cecr) & QE_CR_FLG) == 0,
-  100, 0);
+   ret = -EIO;
+   for (i = 0; i < 100; i++) {
+   if ((ioread32be(&qe_immr->cp.cecr) & QE_CR_FLG) == 0) {
+   ret = 0;
+   break;
+   }
+

[PATCH 4/8] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe

2017-10-31 Thread Zhao Qiang
Move the driver from drivers/soc/fsl/qe to drivers/irqchip and
merge qe_ic.h and qe_ic.c into irq-qeic.c.

Signed-off-by: Zhao Qiang 
---
 drivers/irqchip/Makefile|   1 +
 drivers/irqchip/irq-qeic.c  | 601 
 drivers/soc/fsl/qe/Makefile |   2 +-
 drivers/soc/fsl/qe/qe_ic.c  | 512 -
 drivers/soc/fsl/qe/qe_ic.h  | 103 
 5 files changed, 603 insertions(+), 616 deletions(-)
 create mode 100644 drivers/irqchip/irq-qeic.c
 delete mode 100644 drivers/soc/fsl/qe/qe_ic.c
 delete mode 100644 drivers/soc/fsl/qe/qe_ic.h

diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 0e55d94..627c5d6 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -75,3 +75,4 @@ obj-$(CONFIG_LS_SCFG_MSI) += irq-ls-scfg-msi.o
 obj-$(CONFIG_EZNPS_GIC)+= irq-eznps.o
 obj-$(CONFIG_ARCH_ASPEED)  += irq-aspeed-vic.o
 obj-$(CONFIG_STM32_EXTI)   += irq-stm32-exti.o
+obj-$(CONFIG_QUICC_ENGINE) += irq-qeic.o
diff --git a/drivers/irqchip/irq-qeic.c b/drivers/irqchip/irq-qeic.c
new file mode 100644
index 000..48ceded
--- /dev/null
+++ b/drivers/irqchip/irq-qeic.c
@@ -0,0 +1,601 @@
+/*
+ * drivers/irqchip/irq-qeic.c
+ *
+ * Copyright (C) 2016 Freescale Semiconductor, Inc.  All rights reserved.
+ *
+ * Author: Li Yang 
+ * Based on code from Shlomi Gridish 
+ *
+ * QUICC ENGINE Interrupt Controller
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define NR_QE_IC_INTS  64
+
+/* QE IC registers offset */
+#define QEIC_CICR  0x00
+#define QEIC_CIVEC 0x04
+#define QEIC_CRIPNR0x08
+#define QEIC_CIPNR 0x0c
+#define QEIC_CIPXCC0x10
+#define QEIC_CIPYCC0x14
+#define QEIC_CIPWCC0x18
+#define QEIC_CIPZCC0x1c
+#define QEIC_CIMR  0x20
+#define QEIC_CRIMR 0x24
+#define QEIC_CICNR 0x28
+#define QEIC_CIPRTA0x30
+#define QEIC_CIPRTB0x34
+#define QEIC_CRICR 0x3c
+#define QEIC_CHIVEC0x60
+
+/* Interrupt priority registers */
+#define CIPCC_SHIFT_PRI0   29
+#define CIPCC_SHIFT_PRI1   26
+#define CIPCC_SHIFT_PRI2   23
+#define CIPCC_SHIFT_PRI3   20
+#define CIPCC_SHIFT_PRI4   13
+#define CIPCC_SHIFT_PRI5   10
+#define CIPCC_SHIFT_PRI6   7
+#define CIPCC_SHIFT_PRI7   4
+
+/* CICR priority modes */
+#define CICR_GWCC  0x0004
+#define CICR_GXCC  0x0002
+#define CICR_GYCC  0x0001
+#define CICR_GZCC  0x0008
+#define CICR_GRTA  0x0020
+#define CICR_GRTB  0x0040
+#define CICR_HPIT_SHIFT8
+#define CICR_HPIT_MASK 0x0300
+#define CICR_HP_SHIFT  24
+#define CICR_HP_MASK   0x3f00
+
+/* CICNR */
+#define CICNR_WCC1T_SHIFT  20
+#define CICNR_ZCC1T_SHIFT  28
+#define CICNR_YCC1T_SHIFT  12
+#define CICNR_XCC1T_SHIFT  4
+
+/* CRICR */
+#define CRICR_RTA1T_SHIFT  20
+#define CRICR_RTB1T_SHIFT  28
+
+/* Signal indicator */
+#define SIGNAL_MASK3
+#define SIGNAL_HIGH2
+#define SIGNAL_LOW 0
+
+struct qe_ic {
+   /* Control registers offset */
+   volatile u32 __iomem *regs;
+
+   /* The remapper for this QEIC */
+   struct irq_domain *irqhost;
+
+   /* The "linux" controller struct */
+   struct irq_chip hc_irq;
+
+   /* VIRQ numbers of QE high/low irqs */
+   unsigned int virq_high;
+   unsigned int virq_low;
+};
+
+/*
+ * QE interrupt controller internal structure
+ */
+struct qe_ic_info {
+   /* location of this source at the QIMR register. */
+   u32 mask;
+
+   /* Mask register offset */
+   u32 mask_reg;
+
+   /*
+* for grouped interrupts sources - the interrupt
+* code as appears at the group priority register
+*/
+   u8  pri_code;
+
+   /* Group priority register offset */
+   u32 pri_reg;
+};
+
+static DEFINE_RAW_SPINLOCK(qe_ic_lock);
+
+static struct qe_ic_info qe_ic_info[] = {
+   [1] = {
+  .mask = 0x8000,
+  .mask_reg = QEIC_CIMR,
+  .pri_code = 0,
+  .pri_reg = QEIC_CIPWCC,
+  },
+   [2] = {
+  .mask = 0x4000,
+  .mask_reg = QEIC_CIMR,
+  .pri_code = 1,
+  .pri_reg = QEIC_CIPWCC,
+  },
+   [3] = {
+  .mask = 0x2000,
+   

Re: [PATCH] hwmon: (aspeed-pwm-tacho) Deassert reset in probe

2017-10-31 Thread Guenter Roeck

On 10/31/2017 06:34 PM, Joel Stanley wrote:

The ASPEED SoC must deassert a reset in order to use the PWM/tach
peripheral.

The device tree bindings are updated to document the resets phandle, and
the example is updated to match what is expected for both the reset and
clock phandle. Note that the bindings should have always had the reset
controller, as the hardware is unusable without it.

Signed-off-by: Joel Stanley 


Presumably the driver is being used. This change makes it incompatible with
existing users. This is unacceptable; after all, it is possible that the
device is taken out of reset by ROMMON or BIOS.

On top of that, the reset controller code is quite strict and issues a
backtrace if CONFIG_RESET_CONTROLLER is not enabled. Yet, there is no
dependency added on RESET_CONTROLLER. You might want to consider making
the new control optional and using devm_reset_control_get_optional_exclusive().

The DT change should be a separate patch.

More comments below.


---
  .../devicetree/bindings/hwmon/aspeed-pwm-tacho.txt | 14 ---
  drivers/hwmon/aspeed-pwm-tacho.c   | 27 +++---
  2 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt 
b/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt
index 367c8203213b..3ac02988a1a5 100644
--- a/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt
+++ b/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt
@@ -22,8 +22,9 @@ Required properties for pwm-tacho node:
  - compatible : should be "aspeed,ast2400-pwm-tacho" for AST2400 and
   "aspeed,ast2500-pwm-tacho" for AST2500.
  
-- clocks : a fixed clock providing input clock frequency(PWM

-  and Fan Tach clock)
+- clocks : phandle to clock provider with the clock number in the second cell
+
+- resets : phandle to reset controller with the reset number in the second cell
  
  fan subnode format:

  ===
@@ -48,19 +49,14 @@ Required properties for each child node:
  
  Examples:
  
-pwm_tacho_fixed_clk: fixedclk {

-   compatible = "fixed-clock";
-   #clock-cells = <0>;
-   clock-frequency = <2400>;
-};
-
  pwm_tacho: pwmtachocontroller@1e786000 {
#address-cells = <1>;
#size-cells = <1>;
#cooling-cells = <2>;
reg = <0x1E786000 0x1000>;
compatible = "aspeed,ast2500-pwm-tacho";
-   clocks = <&pwm_tacho_fixed_clk>;
+   clocks = <&syscon ASPEED_CLK_APB>;
+   resets = <&syscon ASPEED_RESET_PWM>;
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_pwm0_default &pinctrl_pwm1_default>;
  
diff --git a/drivers/hwmon/aspeed-pwm-tacho.c b/drivers/hwmon/aspeed-pwm-tacho.c

index f914e5f41048..346a4c5952a3 100644
--- a/drivers/hwmon/aspeed-pwm-tacho.c
+++ b/drivers/hwmon/aspeed-pwm-tacho.c
@@ -7,19 +7,20 @@
   */
  
  #include 

+#include 
  #include 
  #include 
-#include 

Unrelated change.


  #include 
  #include 
  #include 
  #include 
  #include 
-#include 
  #include 
+#include 


Same here. I don't mind the include file reordering, but as a separate patch, 
please.


  #include 
-#include 
  #include 
+#include 
+#include 
  #include 
  
  /* ASPEED PWM & FAN Tach Register Definition */

@@ -181,6 +182,7 @@ struct aspeed_cooling_device {
  
  struct aspeed_pwm_tacho_data {

struct regmap *regmap;
+   struct reset_control *rst;
unsigned long clk_freq;
bool pwm_present[8];
bool fan_tach_present[16];
@@ -931,6 +933,15 @@ static int aspeed_pwm_tacho_probe(struct platform_device 
*pdev)
&aspeed_pwm_tacho_regmap_config);
if (IS_ERR(priv->regmap))
return PTR_ERR(priv->regmap);
+
+   priv->rst = devm_reset_control_get_exclusive(dev, NULL);
+   if (IS_ERR(priv->rst)) {
+   dev_err(dev,
+   "missing or invalid reset controller device tree 
entry");
+   return PTR_ERR(priv->rst);
+   }
+   reset_control_deassert(priv->rst);
+
regmap_write(priv->regmap, ASPEED_PTCR_TACH_SOURCE, 0);
regmap_write(priv->regmap, ASPEED_PTCR_TACH_SOURCE_EXT, 0);
  
@@ -960,6 +971,15 @@ static int aspeed_pwm_tacho_probe(struct platform_device *pdev)

return PTR_ERR_OR_ZERO(hwmon);
  }
  
+static int aspeed_pwm_tacho_remove(struct platform_device *pdev)

+{
+   struct aspeed_pwm_tacho_data *priv = platform_get_drvdata(pdev);
+
+   reset_control_deassert(priv->rst);


This seems to be quite pointless. Also, did you test this code?


+
+   return 0;
+}
+
  static const struct of_device_id of_pwm_tacho_match_table[] = {
{ .compatible = "aspeed,ast2400-pwm-tacho", },
{ .compatible = "aspeed,ast2500-pwm-tacho", },
@@ -969,6 +989,7 @@ MODULE_DEVICE_TABLE(of, of_pwm_tacho_match_table);
  
  static struct platform_driver aspeed_pwm_tacho_driver = {

.probe  = aspeed_pwm_tacho_probe,
+   .probe  = a

Re: pull-request: wireless-drivers 2017-10-31

2017-10-31 Thread David Miller
From: Kalle Valo 
Date: Tue, 31 Oct 2017 17:19:24 +0200

> here's a pull request to net tree for 4.14. Due to the ath10k security
> issue I would like to get this to 4.14 still.
> 
> Please let me know if there are any problems.

Pulled, thanks a lot.


Re: xfs: list corruption in xfs_setup_inode()

2017-10-31 Thread Cong Wang
On Mon, Oct 30, 2017 at 5:33 PM, Dave Chinner  wrote:
> On Mon, Oct 30, 2017 at 02:55:43PM -0700, Cong Wang wrote:
>> Hello,
>>
>> We triggered a list corruption (double add) warning below on our 4.9
>> kernel (the 4.9 kernel we use is based on -stable release, with only a
>> few unrelated networking backports):
>>
>>
>> WARNING: CPU: 5 PID: 628 at lib/list_debug.c:36 __list_add+0xac/0xb0
>> list_add double add: new=8d9d691e0aa0, prev=8d9d7a716608,
>> next=8d9d691e0aa0.
>> Modules linked in: raid0 tcp_diag inet_diag intel_rapl
>> x86_pkg_temp_thermal coretemp iTCO_wdt iTCO_vendor_support
>> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mpt3sas raid_class
>> scsi_transport_sas i2c_i801 i2c_smbus i2c_core ie31200_edac lpc_ich
>> shpchp edac_core video ipmi_si ipmi_devintf ipmi_msghandler
>> acpi_cpufreq sch_fq_codel xfs libcrc32c crc32c_intel e1000e ptp
>> pps_core
>> CPU: 5 PID: 628 Comm: systemd-tmpfile Tainted: GW
>
> Kernel was already tainted before this warning was triggered. What
> was the previous warning(s) that the kernel threw?

Ah, there was the same warning right before the above one:


:[   19.953754] EXT4-fs (md0): mounted filesystem with writeback data
mode. Opts: errors=remount-ro,data=writeback
:[   19.979051] [ cut here ]
:[   19.979216] WARNING: CPU: 3 PID: 628 at lib/list_debug.c:36
__list_add+0xac/0xb0
:[   19.979470] list_add double add: new=8d9d691d72a0,
prev=8d9d7a716608, next=8d9d691d72a0.
:[   19.979780] Modules linked in: raid0 tcp_diag inet_diag intel_rapl
x86_pkg_temp_thermal coretemp iTCO_wdt iTCO_vendor_support
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mpt3sas raid_class
scsi_transport_sas i2c_i801 i2c_smbus i2c_core ie31200_edac lpc_ich
shpchp edac_core video ipmi_si ipmi_devintf ipmi_msghandler
acpi_cpufreq sch_fq_codel xfs libcrc32c crc32c_intel e1000e ptp
pps_core
:[   19.981201] CPU: 3 PID: 628 Comm: systemd-tmpfile Not tainted
4.9.34.el7.x86_64 #1
:[   19.981491] Hardware name: TYAN S5512/S5512, BIOS V8.B13 03/20/2014
:[   19.981706]  b0d48a0abb30 8e389f47 b0d48a0abb80

:[   19.982000]  b0d48a0abb70 8e08989b 0024
8d9d691d72a0
:[   19.982278]  8d9d7a716608 8d9d691d72a0 4000
8d9d7de6d800
:[   19.982555] Call Trace:
:[   19.982645]  [] dump_stack+0x4d/0x66
:[   19.982823]  [] __warn+0xcb/0xf0
:[   19.983007]  [] warn_slowpath_fmt+0x5f/0x80
:[   19.983205]  [] __list_add+0xac/0xb0
:[   19.983383]  [] inode_sb_list_add+0x3b/0x50
:[   19.983610]  [] xfs_setup_inode+0x2c/0x170 [xfs]
:[   19.983837]  [] xfs_ialloc+0x317/0x5c0 [xfs]
:[   19.984072]  [] xfs_dir_ialloc+0x77/0x220 [xfs]
:[   19.984283]  [] ? down_write+0x12/0x40
:[   19.984481]  [] xfs_create+0x482/0x760 [xfs]
:[   19.984697]  [] xfs_generic_create+0x21e/0x2c0 [xfs]
:[   19.984955]  [] xfs_vn_mknod+0x14/0x20 [xfs]
:[   19.985171]  [] xfs_vn_mkdir+0x16/0x20 [xfs]
:[   19.985373]  [] vfs_mkdir+0xe8/0x140
:[   19.985551]  [] SyS_mkdir+0x7a/0xf0
:[   19.985726]  [] entry_SYSCALL_64_fastpath+0x13/0x94
:[   19.985987] ---[ end trace b461c28386dac363 ]---
:[   19.987613] [ cut here ]
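
For reference, the check that fires at lib/list_debug.c:36 can be modelled in userspace. This is only a minimal sketch, not the kernel code: `checked_list_add()` and `init_list_head()` are hypothetical names, and the sketch refuses the insertion when the check fails, which is a choice made here for illustration and not a claim about what the kernel does after warning.

```c
#include <stdio.h>

/* Userspace stand-in for the kernel's circular doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list is a head that points at itself in both directions. */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/*
 * Insert 'new' right after 'head', with sanity checks modelled on the
 * debug variant of list_add.  Returns 0 on success and -1 if a check
 * fails; in the failure case the list is left untouched.
 */
static int checked_list_add(struct list_head *new, struct list_head *head)
{
	struct list_head *prev = head;
	struct list_head *next = head->next;

	/* The neighbours must still point at each other. */
	if (next->prev != prev || prev->next != next) {
		fprintf(stderr, "list_add corruption: prev=%p, next=%p.\n",
			(void *)prev, (void *)next);
		return -1;
	}

	/* Adding a node that is already one of the neighbours. */
	if (new == prev || new == next) {
		fprintf(stderr,
			"list_add double add: new=%p, prev=%p, next=%p.\n",
			(void *)new, (void *)prev, (void *)next);
		return -1;
	}

	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
	return 0;
}
```

Calling `checked_list_add()` twice with the same node on a freshly initialised head returns 0 and then -1. On the second call `next` is already the node being added, which is the same new == next pattern that both warnings above report.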



>
>> 4.9.34.el7.x86_64 #1
>> Hardware name: TYAN S5512/S5512, BIOS V8.B13 03/20/2014
>>  b0d48a0abb30 8e389f47 b0d48a0abb80 
>>  b0d48a0abb70 8e08989b 0024 8d9d691e0aa0
>>  8d9d7a716608 8d9d691e0aa0 4000 8d9d7de6d800
>> Call Trace:
>>  [] dump_stack+0x4d/0x66
>>  [] __warn+0xcb/0xf0
>>  [] warn_slowpath_fmt+0x5f/0x80
>>  [] __list_add+0xac/0xb0
>>  [] inode_sb_list_add+0x3b/0x50
>>  [] xfs_setup_inode+0x2c/0x170 [xfs]
>>  [] xfs_ialloc+0x317/0x5c0 [xfs]
>>  [] xfs_dir_ialloc+0x77/0x220 [xfs]
>
> Inode allocation, so should be a new inode straight from the slab
> cache. That implies memory corruption of some kind. Please turn on
> slab poisoning and try to reproduce.

Are you sure? xfs_iget() seems to search a cache before allocating
a new one:

	ip = radix_tree_lookup(&pag->pag_ici_root, agino);

	if (ip) {
		error = xfs_iget_cache_hit(pag, ip, ino, flags, lock_flags);
		if (error)
			goto out_error_or_again;
	} else {
		rcu_read_unlock();
		if (flags & XFS_IGET_INCORE) {
			error = -ENOENT;
			goto out_error_or_again;
		}
		XFS_STATS_INC(mp, xs_ig_missed);

		error = xfs_iget_cache_miss(mp, pag, tp, ino, &ip,
					    flags, lock_flags);
		if (error)
			goto out_error_or_again;
	}


>
>>  [] ? down_write+0x12/0x40
>>  [] xfs_create+0x482/0x760 [xfs]
>>  [] xfs_generic_create+0x21e/0x2c0 [xfs]
>>  [] xfs_vn_mknod+0x14/0x20 [xfs]
>>  [] xfs_vn_mkdir+0x16/0x20 [xfs]
>>  [] vfs_mkdir+0xe8/0x140
>>  [] SyS_mkdir+0x7a/0xf0
