Re: sg_io HARDENED_USERCOPY_PAGESPAN trace

2016-12-28 Thread Christoph Hellwig
On Wed, Dec 28, 2016 at 04:40:16PM -0500, Dave Jones wrote:
>  sg_io+0x113/0x470

Can you resolve that to a source line using gdb?



Re: mm: fix typo of cache_alloc_zspage()

2016-12-28 Thread Minchan Kim
On Thu, Dec 29, 2016 at 04:34:03PM +0900, Sergey Senozhatsky wrote:
> Hello,
> 
> On (12/29/16 15:59), Minchan Kim wrote:
> [..]
> > > I don't know... do we want to have it as a separate patch?
> > > may be we can fold it into some other patch someday later.
> > 
> > Xishi spent his time to make the patch(review,create/send). And I want to
> > give a credit to him. :)
> 
> sure, I didn't mean "let's seize the credit" :)  my reasoning was
> that that patch hardly can be counted even as trivial. per
> documentation:
> 
> : Trivial patches must qualify for one of the following rules:
> :
> : - Spelling fixes in documentation
> : - Spelling fixes for errors which could break :manpage:`grep(1)`
> : - Warning fixes (cluttering with useless warnings is bad)
> : - Compilation fixes (only if they are actually correct)
> : - Runtime fixes (only if they actually fix things)
> : - Removing use of deprecated functions/macros
> : - Contact detail and documentation fixes
> : - Non-portable code replaced by portable code (even in arch-specific,
> :   since people copy, as long as it's trivial)
> : - Any fix by the author/maintainer of the file (ie. patch monkey
> :   in re-transmission mode)
> 
> 
> hence was my question. we can have it as "p.s. in this patch we also
> remove XYZ reported by Xishi Qiu".
> 
> but up to you.
> 
> 
> 
> for instance, we can have Xishi's fix up as part of this "fix documentation
> typos" patch. which can be counted in as trivial.

Xishi, could you resend your patch with the fixes Sergey pointed out
folded in, if Sergey doesn't mind?

Please include Sergey's Signed-off-by (SOB), too.

> 
> 
> ---
> 
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 9cc3c0b2c2c1..af7cd90c26f7 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -25,7 +25,7 @@
>   * Usage of struct page flags:
>   * PG_private: identifies the first component page
>   * PG_private2: identifies the last component page
> - * PG_owner_priv_1: indentifies the huge component page
> + * PG_owner_priv_1: identifies the huge component page
>   *
>   */
>  
> @@ -65,7 +65,7 @@
>  #define ZS_ALIGN   8
>  
>  /*
> - * A single 'zspage' is composed of up to 2^N discontiguous 0-order (single)
> + * A single 'zspage' is composed of up to 2^N discontinuous 0-order (single)

Hmm, is "discontinuous" right?
I'm not a native speaker, but is "discontiguous" really wrong? "contiguous"
is widely used in the mm code.


>   * pages. ZS_MAX_ZSPAGE_ORDER defines upper limit on N.
>   */
>  #define ZS_MAX_ZSPAGE_ORDER 2
> @@ -2383,7 +2383,7 @@ struct zs_pool *zs_create_pool(const char *name)
> goto err;
>  
> /*
> -* Iterate reversly, because, size of size_class that we want to use
> +* Iterate reversely, because, size of size_class that we want to use
>  * for merging should be larger or equal to current size.
>  */
> for (i = zs_size_classes - 1; i >= 0; i--) {
> 
> 
> ---
> 
>   -ss
> 


Re: [PATCH 4/7] mm, vmscan: show LRU name in mm_vmscan_lru_isolate tracepoint

2016-12-28 Thread Michal Hocko
On Thu 29-12-16 15:02:04, Minchan Kim wrote:
> On Wed, Dec 28, 2016 at 04:30:29PM +0100, Michal Hocko wrote:
> > From: Michal Hocko 
> > 
> > mm_vmscan_lru_isolate currently prints only whether the LRU we isolate
> > from is file or anonymous but we do not know which LRU this is. It is
> > useful to know whether the list is active or inactive as well. Change
> > the tracepoint to show symbolic names of the LRU instead.
> > 
> > Signed-off-by: Michal Hocko 
> 
> Not exactly the same as this, but the idea is almost the same.
> I used almost the same tracepoint to investigate an aging (i.e., deactivating)
> problem in a 32b kernel with node-lru.
> It was enough. Namely, I didn't need a tracepoint in shrink_active_list like
> your first patch.
> Your first patch is more straightforward and informative. But as you introduced
> this patch, I want to ask here.
> Isn't this patch enough, without your first one, to find such a problem?

I assume this should be a reply to
http://lkml.kernel.org/r/20161228153032.10821-8-mho...@kernel.org, right?
And you are right that for the particular problem it was enough to have
a tracepoint inside inactive_list_is_low and shrink_active_list one
wasn't really needed. On the other hand aging issues are really hard to
debug as well and so I think that both are useful. The first one tells us
_why_ we do aging while the latter tells us _how_ we do that.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH 3/7] mm, vmscan: show the number of skipped pages in mm_vmscan_lru_isolate

2016-12-28 Thread Hillf Danton
On Wednesday, December 28, 2016 11:30 PM Michal Hocko wrote:
> From: Michal Hocko 
> 
> mm_vmscan_lru_isolate shows the number of requested, scanned and taken
> pages. This is mostly OK but on 32b systems the number of scanned pages
> is quite misleading because it includes both the scanned and skipped
> pages.  Moreover the skipped part is scaled based on the number of taken
> pages. Let's report the exact numbers without any additional logic and
> add the number of skipped pages. This should make the reported data much
> easier to interpret.
> 
> Signed-off-by: Michal Hocko 
> ---
Acked-by: Hillf Danton  



[PATCH V2] mm: fix typo of cache_alloc_zspage()

2016-12-28 Thread Xishi Qiu
Delete the extra semicolon after the function body; it was introduced in
commit 3783689 ("zsmalloc: introduce zspage structure").

Signed-off-by: Xishi Qiu 
---
 mm/zsmalloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 9cc3c0b..2d6c92e 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -364,7 +364,7 @@ static struct zspage *cache_alloc_zspage(struct zs_pool *pool, gfp_t flags)
 {
 	return kmem_cache_alloc(pool->zspage_cachep,
 			flags & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
-};
+}
 
 static void cache_free_zspage(struct zs_pool *pool, struct zspage *zspage)
 {
-- 
1.8.3.1




Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint

2016-12-28 Thread Michal Hocko
On Thu 29-12-16 14:33:59, Minchan Kim wrote:
> On Wed, Dec 28, 2016 at 04:30:27PM +0100, Michal Hocko wrote:
> > From: Michal Hocko 
> > 
> > Our reclaim process has several tracepoints to tell us more about how
> > things are progressing. We are, however, missing a tracepoint to track
> > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> > the number of scanned, rotated, deactivated and freed pages from the
> > particular node's active list.
> > 
> > Signed-off-by: Michal Hocko 
> > ---
> >  include/linux/gfp.h   |  2 +-
> >  include/trace/events/vmscan.h | 38 ++
> >  mm/page_alloc.c   |  6 +-
> >  mm/vmscan.c   | 22 +-
> >  4 files changed, 61 insertions(+), 7 deletions(-)
> > 
> > diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> > index 4175dca4ac39..61aa9b49e86d 100644
> > --- a/include/linux/gfp.h
> > +++ b/include/linux/gfp.h
> > @@ -503,7 +503,7 @@ void * __meminit alloc_pages_exact_nid(int nid, size_t 
> > size, gfp_t gfp_mask);
> >  extern void __free_pages(struct page *page, unsigned int order);
> >  extern void free_pages(unsigned long addr, unsigned int order);
> >  extern void free_hot_cold_page(struct page *page, bool cold);
> > -extern void free_hot_cold_page_list(struct list_head *list, bool cold);
> > +extern int free_hot_cold_page_list(struct list_head *list, bool cold);
> >  
> >  struct page_frag_cache;
> >  extern void __page_frag_drain(struct page *page, unsigned int order,
> > diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> > index 39bad8921ca1..d34cc0ced2be 100644
> > --- a/include/trace/events/vmscan.h
> > +++ b/include/trace/events/vmscan.h
> > @@ -363,6 +363,44 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
> > show_reclaim_flags(__entry->reclaim_flags))
> >  );
> >  
> > +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> > +
> > +   TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
> > +   unsigned long nr_unevictable, unsigned long nr_deactivated,
> > +   unsigned long nr_rotated, int priority, int file),
> > +
> > +   TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, 
> > nr_rotated, priority, file),
> 
> I agree it is helpful. And it was when I investigated the aging problem on 32bit
> when node-lru was introduced. However, the question is whether we really need
> all those kinds of information. Isn't nr_taken, nr_deactivated, priority and
> file enough?

Dunno. Is it harmful to add this information? I like it more when the
numbers just add up and you have a clear picture. You never know what
might be useful when debugging a weird behavior. 

[...]
> > -   move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> > -   move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
> > +   nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> 
> Who uses nr_activate here?

This is an omission. I just forgot to add it... Thanks for noticing.

-- 
Michal Hocko
SUSE Labs


Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint

2016-12-28 Thread Hillf Danton

On Wednesday, December 28, 2016 11:30 PM Michal Hocko wrote:
> From: Michal Hocko 
> 
> Our reclaim process has several tracepoints to tell us more about how
> things are progressing. We are, however, missing a tracepoint to track
> active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> the number of scanned, rotated, deactivated and freed pages from the
> particular node's active list.
> 
> Signed-off-by: Michal Hocko 
> ---
Acked-by: Hillf Danton  




Re: [PATCH] Revert "mmc: dw_mmc-rockchip: add runtime PM support"

2016-12-28 Thread Jaehoon Chung
On 12/29/2016 12:02 PM, Jaehoon Chung wrote:
> Hi Randy,
> 
> On 12/29/2016 12:34 AM, Randy Li wrote:
>> This reverts commit f90142683f04bcb0729bf0df67a5e29562b725b9.
>> It is reported that making RK3288 can't boot from eMMC/MMC.
> 
> Could you explain in more detail?
> As you mentioned, this patch is making that RK3288 can't boot..then why?
> Good way should be that finds the main reason and fixes it.
> Not just revert.

To Shawn,

Could you check this, if you have an RK3288?
If it's not working, we need to revert this patch until the problem is found.

Best Regards,
Jaehoon Chung

> 
> Best Regards,
> Jaehoon Chung
> 
>>
>> Signed-off-by: Randy Li 
>> ---
>>  drivers/mmc/host/dw_mmc-rockchip.c | 41 
>> +++---
>>  1 file changed, 3 insertions(+), 38 deletions(-)
>>
>> diff --git a/drivers/mmc/host/dw_mmc-rockchip.c 
>> b/drivers/mmc/host/dw_mmc-rockchip.c
>> index 9a46e46..3189234 100644
>> --- a/drivers/mmc/host/dw_mmc-rockchip.c
>> +++ b/drivers/mmc/host/dw_mmc-rockchip.c
>> @@ -14,7 +14,6 @@
>>  #include 
>>  #include 
>>  #include 
>> -#include 
>>  #include 
>>  
>>  #include "dw_mmc.h"
>> @@ -327,7 +326,6 @@ static int dw_mci_rockchip_probe(struct platform_device 
>> *pdev)
>>  {
>>  const struct dw_mci_drv_data *drv_data;
>>  const struct of_device_id *match;
>> -int ret;
>>  
>>  if (!pdev->dev.of_node)
>>  return -ENODEV;
>> @@ -335,49 +333,16 @@ static int dw_mci_rockchip_probe(struct 
>> platform_device *pdev)
>>  match = of_match_node(dw_mci_rockchip_match, pdev->dev.of_node);
>>  drv_data = match->data;
>>  
>> -pm_runtime_get_noresume(&pdev->dev);
>> -pm_runtime_set_active(&pdev->dev);
>> -pm_runtime_enable(&pdev->dev);
>> -pm_runtime_set_autosuspend_delay(&pdev->dev, 50);
>> -pm_runtime_use_autosuspend(&pdev->dev);
>> -
>> -ret = dw_mci_pltfm_register(pdev, drv_data);
>> -if (ret) {
>> -pm_runtime_disable(&pdev->dev);
>> -pm_runtime_set_suspended(&pdev->dev);
>> -pm_runtime_put_noidle(&pdev->dev);
>> -return ret;
>> -}
>> -
>> -pm_runtime_put_autosuspend(&pdev->dev);
>> -
>> -return 0;
>> +return dw_mci_pltfm_register(pdev, drv_data);
>>  }
>>  
>> -static int dw_mci_rockchip_remove(struct platform_device *pdev)
>> -{
>> -pm_runtime_get_sync(&pdev->dev);
>> -pm_runtime_disable(&pdev->dev);
>> -pm_runtime_put_noidle(&pdev->dev);
>> -
>> -return dw_mci_pltfm_remove(pdev);
>> -}
>> -
>> -static const struct dev_pm_ops dw_mci_rockchip_dev_pm_ops = {
>> -SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
>> -pm_runtime_force_resume)
>> -SET_RUNTIME_PM_OPS(dw_mci_runtime_suspend,
>> -   dw_mci_runtime_resume,
>> -   NULL)
>> -};
>> -
>>  static struct platform_driver dw_mci_rockchip_pltfm_driver = {
>>  .probe  = dw_mci_rockchip_probe,
>> -.remove = dw_mci_rockchip_remove,
>> +.remove = dw_mci_pltfm_remove,
>>  .driver = {
>>  .name   = "dwmmc_rockchip",
>>  .of_match_table = dw_mci_rockchip_match,
>> -.pm = &dw_mci_rockchip_dev_pm_ops,
>> +.pm = &dw_mci_pltfm_pmops,
>>  },
>>  };
>>  
>>
> 



Re: [PATCH v3 4/4] phy: qcom-qmp: new qmp phy driver for qcom-chipsets

2016-12-28 Thread Vivek Gautam
Hi Stephen,


On Thu, Dec 29, 2016 at 4:46 AM, Stephen Boyd  wrote:
> On 12/20, Vivek Gautam wrote:
>> Qualcomm SOCs have QMP phy controller that provides support
>> to a number of controller, viz. PCIe, UFS, and USB.
>> Add a new driver, based on generic phy framework, for this
>> phy controller.
>>
>> Signed-off-by: Vivek Gautam 
>> Tested-by: Srinivas Kandagatla 
>> ---
>>
>> +
>> +static struct phy *qcom_qmp_phy_xlate(struct device *dev,
>> + struct of_phandle_args *args)
>> +{
>> + struct qcom_qmp_phy *qphy = dev_get_drvdata(dev);
>> + int i;
>> +
>> + if (WARN_ON(args->args[0] >= qphy->cfg->nlanes))
>> + return ERR_PTR(-ENODEV);
>> +
>> + for (i = 0; i < qphy->cfg->nlanes; i++)
>> + /* phys[i]->index */
>> + if (i == args->args[0])
>> + return qphy->phys[i]->phy;
>
> What's the loop for? If args->args[0] < qphy->cfg->nlanes then we
> should be able to directly index the qphy->phys array with that
> number and return it.

Right, will do that.
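
For reference, a minimal sketch of the direct-index lookup suggested above,
written as standalone userspace C so it can be compiled on its own. The
struct layouts and the lookup_lane() name are stand-ins invented for
illustration, and NULL takes the place of ERR_PTR(-ENODEV); only the
"bounds check, then index directly" shape comes from the review comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's structures. */
struct phy { int id; };
struct qmp_phy_desc { struct phy *phy; };
struct qmp_cfg { unsigned int nlanes; };
struct qcom_qmp_phy {
	const struct qmp_cfg *cfg;
	struct qmp_phy_desc **phys;
};

/*
 * Return the phy for the requested lane, or NULL when the lane index is
 * out of range (the driver would return ERR_PTR(-ENODEV) instead).
 */
static struct phy *lookup_lane(const struct qcom_qmp_phy *qphy,
			       unsigned int lane)
{
	assert(qphy != NULL && qphy->cfg != NULL);

	if (lane >= qphy->cfg->nlanes)
		return NULL;

	/* The bounds check above makes a search loop unnecessary. */
	return qphy->phys[lane]->phy;
}
```

With the range check done up front, the trailing error return after the
loop also becomes unreachable and could go away.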

>
>> +
>> + return ERR_PTR(-ENODEV);
>> +}
>> +
> [...]
>> +
>> +/*
>> + * The _pipe_clksrc generated by PHY goes to the GCC that gate
>> + * controls it. The _pipe_clk coming out of the GCC is requested
>> + * by the PHY driver for its operations.
>> + * We register the _pipe_clksrc here. The gcc driver takes care
>> + * of assigning this _pipe_clksrc as parent to _pipe_clk.
>> + * Below picture shows this relationship.
>> + *
>> + *         +---------------+
>> + *         |   PHY block   |<<---------------------------------------+
>> + *         |               |                                         |
>> + *         |   +-------+   |                   +-----+               |
>> + *   I/P---^-->|  PLL  |---^--->pipe_clksrc--->| GCC |--->pipe_clk---+
>> + *    clk  |   +-------+   |                   +-----+
>> + *         +---------------+
>
> There are mixed tabs and spaces in this diagram causing
> confusion in my editor. Please make it only spaces so the picture
> comes out correctly.

Sure, will do that.

>
>> + *
>> + */
>> +static int phy_pipe_clk_register(struct qcom_qmp_phy *qphy, int id)
>> +{
>> + char clk_name[MAX_PROP_NAME];
>
> I'm not sure MAX_PROP_NAME is the same as some max clk name but
> ok. We should be able to calculate that the maximum is length of
> usb3_phy_pipe_clk_src for now though?

Yeah, I thought of using the same macro, considering that it provides
32 characters :-)
I will rather use the length of usb3_phy_pipe_clk_src for now. Maybe
#define MAX_CLK_NAME   24
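
One possible way to avoid picking the size by hand is to derive it from the
longest fixed name and to check the snprintf() return value for truncation.
This is only a userspace sketch under that assumption: the sizeof-based
MAX_CLK_NAME, the two-value enum and format_pipe_clk_name() are illustrative
names, not the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Longest fixed name plus its NUL terminator: 21 + 1 = 22 bytes. */
#define MAX_CLK_NAME sizeof("usb3_phy_pipe_clk_src")

enum phy_type { PHY_TYPE_USB3, PHY_TYPE_PCIE };

/*
 * Write the pipe clock source name into buf. Returns 0 on success and
 * -1 if the type is unknown or the name would not fit in len bytes.
 */
static int format_pipe_clk_name(char *buf, size_t len,
				enum phy_type type, int id)
{
	int n;

	assert(buf != NULL && len > 0);

	switch (type) {
	case PHY_TYPE_USB3:
		n = snprintf(buf, len, "usb3_phy_pipe_clk_src");
		break;
	case PHY_TYPE_PCIE:
		n = snprintf(buf, len, "pcie_%d_pipe_clk_src", id);
		break;
	default:
		return -1;
	}

	/* snprintf() returns the untruncated length; n >= len means truncation. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A PCIe id of up to three digits still fits in this buffer; anything longer
is reported as an error instead of being silently cut off.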

>
>> + struct clk *clk;
>> +
>> + memset(&clk_name, 0, sizeof(clk_name));
>> + switch (qphy->cfg->type) {
>> + case PHY_TYPE_USB3:
>> + snprintf(clk_name, MAX_PROP_NAME, "usb3_phy_pipe_clk_src");
>> + break;
>> + case PHY_TYPE_PCIE:
>> + snprintf(clk_name, MAX_PROP_NAME, "pcie_%d_pipe_clk_src", id);
>> + break;
>> + default:
>> + return -EINVAL;
>> + }
>> +
>> + /* controllers using QMP phys use 125MHz pipe clock interface */
>> + clk = clk_register_fixed_rate(qphy->dev, clk_name, NULL, 0, 12500);
>
> I was hoping you would be able to calculate the actual output
> rate by reading hardware. This is ok too though.

Yea, I too was looking to understand the phy registers needed to
calculate and re-calibrate the pipe clock rate, but couldn't find much
from the IP programming guide. So, I had to fall back to registering
a fixed-rate clock, since we are sure that the pipe clock rate is fixed at
125 MHz for the controllers using pipe interface.
Once we find out the required information, we can as well register clk_ops
for this clock.

> Just please use
> clk_hw_register_fixed_rate() instead. And you'll probably need
> some sort of devm() usage here to handle probe failure, so I
> would probably roll my own and allocate a fixed_rate clk
> structure and set the rate/name directly and then call
> devm_clk_hw_register().

Sure, will do that.

>
>> +
>> + return PTR_ERR_OR_ZERO(clk);
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> a Linux Foundation Collaborative Project


Thanks
Vivek
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v3 4/4] phy: qcom-qmp: new qmp phy driver for qcom-chipsets

2016-12-28 Thread Vivek Gautam
Hi Stephen,


On Thu, Dec 29, 2016 at 4:46 AM, Stephen Boyd  wrote:
> On 12/20, Vivek Gautam wrote:
>> Qualcomm SOCs have QMP phy controller that provides support
>> to a number of controller, viz. PCIe, UFS, and USB.
>> Add a new driver, based on generic phy framework, for this
>> phy controller.
>>
>> Signed-off-by: Vivek Gautam 
>> Tested-by: Srinivas Kandagatla 
>> ---
>>
>> +
>> +static struct phy *qcom_qmp_phy_xlate(struct device *dev,
>> + struct of_phandle_args *args)
>> +{
>> + struct qcom_qmp_phy *qphy = dev_get_drvdata(dev);
>> + int i;
>> +
>> + if (WARN_ON(args->args[0] >= qphy->cfg->nlanes))
>> + return ERR_PTR(-ENODEV);
>> +
>> + for (i = 0; i < qphy->cfg->nlanes; i++)
>> + /* phys[i]->index */
>> + if (i == args->args[0])
>> + return qphy->phys[i]->phy;
>
> What's the loop for? If args->arg[0] < qphy->cfg->nlanes then we
> should be able to directly index the qphy->phys array with that
> number and return it.

Right, will do that.

>
>> +
>> + return ERR_PTR(-ENODEV);
>> +}
>> +
> [...]
>> +
>> +/*
>> + * The _pipe_clksrc generated by PHY goes to the GCC that gate
>> + * controls it. The _pipe_clk coming out of the GCC is requested
>> + * by the PHY driver for its operations.
>> + * We register the _pipe_clksrc here. The gcc driver takes care
>> + * of assigning this _pipe_clksrc as parent to _pipe_clk.
>> + * Below picture shows this relationship.
>> + *
>> + *  +--+
>> + *  |  PHY block   |<<---+
>> + *  |  | |
>> + *  |   +---+  |   +-+   |
>> + *   I/P---^-->|  PLL  |--^--->pipe_clksrc--->| GCC |--->pipe_clk---+
>> + *   clk   |   +---+  |+-+
>> + *  +--+
>
> There are mixed tabs and spaces in this diagram causing
> confusion in my editor. Please make it only spaces so the picture
> comes out correctly.

Sure, will do that.

>
>> + *
>> + */
>> +static int phy_pipe_clk_register(struct qcom_qmp_phy *qphy, int id)
>> +{
>> + char clk_name[MAX_PROP_NAME];
>
> I'm not sure MAX_PROP_NAME is the same as some max clk name but
> ok. We should be able to calculate that the maximum is length of
> usb3_phy_pipe_clk_src for now though?

Yeah, I thought of using the same macro, considering that it provides
32 characters :-)
I will rather use the length of usb3_phy_pipe_clk_src for now. Maybe
#define MAX_CLK_NAME   24
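As a rough userspace sketch of the sizing question: the buffer can be derived from the longest name with sizeof, and the snprintf() return value shows whether a name was truncated. The enum and function names here are made up for the example, and only the two name formats come from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Longest fixed name plus the terminating NUL: 21 + 1 = 22 bytes. */
#define PIPE_CLK_NAME_LEN sizeof("usb3_phy_pipe_clk_src")

/* Hypothetical stand-in for the driver's phy type. */
enum phy_type_model { MODEL_PHY_USB3, MODEL_PHY_PCIE };

/*
 * Format the pipe clock source name into buf.
 * Returns 0 on success, -1 on an unknown type or if the name was truncated.
 */
static int pipe_clk_name(char *buf, size_t len, enum phy_type_model type,
                         int id)
{
    int n;

    switch (type) {
    case MODEL_PHY_USB3:
        n = snprintf(buf, len, "usb3_phy_pipe_clk_src");
        break;
    case MODEL_PHY_PCIE:
        n = snprintf(buf, len, "pcie_%d_pipe_clk_src", id);
        break;
    default:
        return -1;
    }

    /* snprintf returns the untruncated length; n >= len means truncation. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With a single-digit id the PCIe name is 19 characters, so 22 bytes (or the proposed 24) covers both cases. snprintf() always NUL-terminates a non-empty buffer, so a preceding memset() is not needed for termination.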

>
>> + struct clk *clk;
>> +
>> + memset(&clk_name, 0, sizeof(clk_name));
>> + switch (qphy->cfg->type) {
>> + case PHY_TYPE_USB3:
>> + snprintf(clk_name, MAX_PROP_NAME, "usb3_phy_pipe_clk_src");
>> + break;
>> + case PHY_TYPE_PCIE:
>> + snprintf(clk_name, MAX_PROP_NAME, "pcie_%d_pipe_clk_src", id);
>> + break;
>> + default:
>> + return -EINVAL;
>> + }
>> +
>> + /* controllers using QMP phys use 125MHz pipe clock interface */
>> + clk = clk_register_fixed_rate(qphy->dev, clk_name, NULL, 0, 12500);
>
> I was hoping you would be able to calculate the actual output
> rate by reading hardware. This is ok too though.

Yea, I too was looking to understand the phy registers needed to
calculate and re-calibrate the pipe clock rate, but couldn't find much
from the IP programming guide. So, I had to fall back to registering
a fixed-rate clock, since we are sure that the pipe clock rate is fixed at
125 MHz for the controllers using pipe interface.
Once we find out the required information, we can as well register clk_ops
for this clock.

> Just please use
> clk_hw_register_fixed_rate() instead. And you'll probably need
> some sort of devm() usage here to handle probe failure, so I
> would probably roll my own and allocate a fixed_rate clk
> structure and set the rate/name directly and then call
> devm_clk_hw_register().

Sure, will do that.

>
>> +
>> + return PTR_ERR_OR_ZERO(clk);
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> a Linux Foundation Collaborative Project


Thanks
Vivek
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 1/7] mm, vmscan: remove unused mm_vmscan_memcg_isolate

2016-12-28 Thread Hillf Danton

On Wednesday, December 28, 2016 11:30 PM Michal Hocko wrote:
> From: Michal Hocko 
> 
> the trace point is not used since 925b7673cce3 ("mm: make per-memcg LRU
> lists exclusive") so it can be removed.
> 
> Signed-off-by: Michal Hocko 
> ---
Acked-by: Hillf Danton  




Re: mm: fix typo of cache_alloc_zspage()

2016-12-28 Thread Sergey Senozhatsky
Hello,

On (12/29/16 15:59), Minchan Kim wrote:
[..]
> > I don't know... do we want to have it as a separate patch?
> > may be we can fold it into some other patch someday later.
> 
> Xishi spent his time to make the patch(review,create/send). And I want to
> give a credit to him. :)

sure, I didn't mean "let's seize the credit" :)  my reasoning was
that that patch hardly can be counted even as trivial. per
documentation:

: Trivial patches must qualify for one of the following rules:
:
: - Spelling fixes in documentation
: - Spelling fixes for errors which could break :manpage:`grep(1)`
: - Warning fixes (cluttering with useless warnings is bad)
: - Compilation fixes (only if they are actually correct)
: - Runtime fixes (only if they actually fix things)
: - Removing use of deprecated functions/macros
: - Contact detail and documentation fixes
: - Non-portable code replaced by portable code (even in arch-specific,
:   since people copy, as long as it's trivial)
: - Any fix by the author/maintainer of the file (ie. patch monkey
:   in re-transmission mode)


hence was my question. we can have it as "p.s. in this patch we also
remove XYZ reported by Xishi Qiu".

but up to you.



for instance, we can have Xishi's fix up as part of this "fix documentation
typos" patch. which can be counted in as trivial.


---

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 9cc3c0b2c2c1..af7cd90c26f7 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -25,7 +25,7 @@
  * Usage of struct page flags:
  * PG_private: identifies the first component page
  * PG_private2: identifies the last component page
- * PG_owner_priv_1: indentifies the huge component page
+ * PG_owner_priv_1: identifies the huge component page
  *
  */
 
@@ -65,7 +65,7 @@
 #define ZS_ALIGN   8
 
 /*
- * A single 'zspage' is composed of up to 2^N discontiguous 0-order (single)
+ * A single 'zspage' is composed of up to 2^N discontinuous 0-order (single)
  * pages. ZS_MAX_ZSPAGE_ORDER defines upper limit on N.
  */
 #define ZS_MAX_ZSPAGE_ORDER 2
@@ -2383,7 +2383,7 @@ struct zs_pool *zs_create_pool(const char *name)
goto err;
 
/*
-* Iterate reversly, because, size of size_class that we want to use
+* Iterate reversely, because, size of size_class that we want to use
 * for merging should be larger or equal to current size.
 */
for (i = zs_size_classes - 1; i >= 0; i--) {


---

-ss


Re: [PATCH v3 RESEND 02/11] pwm: imx: remove ipg clock

2016-12-28 Thread Stefan Agner
On 2016-12-28 23:01, Lukasz Majewski wrote:
> Hi Stefan,
> 
>> Hi Stefan,
>>
>> > On 2016-12-26 23:55, Lukasz Majewski wrote:
>> > > From: Sascha Hauer 
>> > >
>> > > The use of the ipg clock was introduced with commit 7b27c160c681
>> > > ("pwm: i.MX: fix clock lookup").
>> > > In the commit message it was claimed that the ipg clock is enabled
>> > > for register accesses. This is true for the ->config() callback,
>> > > but not for the ->set_enable() callback. Given that the ipg clock
>> > > is not consistently enabled for all register accesses we can
>> > > assume that either it is not required at all or that the current
>> > > code does not work. Remove the ipg clock code for now so that
>> > > it's no longer in the way of refactoring the driver.
>> >
>> > Hi Lukasz,
>> >
>> > Has my concern been addressed in any way with this resend?
>> > https://lkml.org/lkml/2016/11/22/729
>>
>> Unfortunately not, since I don't have iMX7 for testing.
>>
>> >
>> > Breaking hardware is usually not an option :-)
>>
>> Yes, I know, but
>>
>> Please look at the patch set from my perspective:
>>
>> I originally wanted to add polarity inversion to PWM. Then, there was
>> the request from you and Boris to go with "atomicity" support, so I
>> converted the driver to support it.
>>
>> This patch set has been resent on purpose at the end of merge window,
>> so we do have some time to fix it if it would be accepted to -next
>> tree (or any other PWM related one). Moreover, the burden for
>> preparing patches would be smaller - since we all have agreed that
>> "atomicity" is a more than welcome feature.
>>
>>
>> >
>> > I checked the i.MX 7 reference manual again, and in this case the
>> > peripheral access clock is a clock line named "ipg_clk_s" (Table
>> > 12-20), with a clock root "PWM1_CLK_ROOT" (Table 5-12). In i.MX 7
>> > all clocks are behind a single gate, so in fact it does not matter
>> > which clock we take. Given that others have peripheral access
>> > behind the "pwm" gate, I guess we should take the "pwm" gate...
>>
>>
>> If possible please prepare a patch. It would be the best solution.
> 
> If I might ask - are you willing to prepare a patch to fix iMX7 or shall
> I roll back to the ipg code already present in mainline?

I doubt that just rolling back the existing code works for the new
atomic change... I guess we have to really introduce a clk enable on
each register access, also for the new atomic code.

Not sure if we should fold this change into an existing commit or create a
new patch on top of your changes. The latter is probably easier, but
creates a "window of brokenness" for i.MX 7 in git history. But then, I
don't really mind since it currently works more or less by chance...

I can prepare a patch.
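To make the "clk enable on each register access" idea concrete, here is a small userspace model. Everything in it (the model_clk and model_pwm types, the function names) is invented for illustration, and it is not the eventual driver patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a gated clock; counts outstanding enables. */
struct model_clk {
    int enable_count;
    int fail;           /* set to 1 to simulate a failing enable */
};

static int model_clk_enable(struct model_clk *clk)
{
    if (clk->fail)
        return -1;
    clk->enable_count++;
    return 0;
}

static void model_clk_disable(struct model_clk *clk)
{
    clk->enable_count--;
}

struct model_pwm {
    struct model_clk clk;
    uint32_t regs[4];   /* stands in for the memory-mapped registers */
};

/*
 * Write one register with the clock enabled only for the access.
 * The caller guarantees reg < 4.
 */
static int model_pwm_write(struct model_pwm *pwm, unsigned int reg,
                           uint32_t val)
{
    int ret = model_clk_enable(&pwm->clk);

    if (ret)
        return ret;     /* never touch the register with the clock off */

    pwm->regs[reg] = val;
    model_clk_disable(&pwm->clk);
    return 0;
}
```

Routing every access through one helper like this keeps the enable/disable calls balanced, which is the property the ->config() and ->set_enable() paths did not share.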

--
Stefan

> 
> 
> Best regards,
> Łukasz Majewski
> 
>>
>> Thanks in advance,
>> Łukasz Majewski
>>
>> >
>> > --
>> > Stefan
>> >
>> > >
>> > > Signed-off-by: Sascha Hauer 
>> > > Cc: Philipp Zabel 
>> > > ---
>> > > [commit message text refactored by Lukasz Majewski
>> > > ] ---
>> > > Changes for v3:
>> > > - New patch
>> > > ---
>> > >  drivers/pwm/pwm-imx.c | 19 +--
>> > >  1 file changed, 1 insertion(+), 18 deletions(-)
>> > >
>> > > diff --git a/drivers/pwm/pwm-imx.c b/drivers/pwm/pwm-imx.c
>> > > index d600fd5..70609ef2 100644
>> > > --- a/drivers/pwm/pwm-imx.c
>> > > +++ b/drivers/pwm/pwm-imx.c
>> > > @@ -49,7 +49,6 @@
>> > >
>> > >  struct imx_chip {
>> > >  struct clk  *clk_per;
>> > > -struct clk  *clk_ipg;
>> > >
>> > >  void __iomem*mmio_base;
>> > >
>> > > @@ -204,17 +203,8 @@ static int imx_pwm_config(struct pwm_chip
>> > > *chip, struct pwm_device *pwm, int duty_ns, int period_ns)
>> > >  {
>> > >  struct imx_chip *imx = to_imx_chip(chip);
>> > > -int ret;
>> > > -
>> > > -ret = clk_prepare_enable(imx->clk_ipg);
>> > > -if (ret)
>> > > -return ret;
>> > >
>> > > -ret = imx->config(chip, pwm, duty_ns, period_ns);
>> > > -
>> > > -clk_disable_unprepare(imx->clk_ipg);
>> > > -
>> > > -return ret;
>> > > +return imx->config(chip, pwm, duty_ns, period_ns);
>> > >  }
>> > >
>> > >  static int imx_pwm_enable(struct pwm_chip *chip, struct
>> > > pwm_device *pwm) @@ -293,13 +283,6 @@ static int
>> > > imx_pwm_probe(struct platform_device *pdev) return
>> > > PTR_ERR(imx->clk_per); }
>> > >
>> > > -imx->clk_ipg = devm_clk_get(&pdev->dev, "ipg");
>> > > -if (IS_ERR(imx->clk_ipg)) {
>> > > -dev_err(&pdev->dev, "getting ipg clock failed
>> > > with %ld\n",
>> > > -PTR_ERR(imx->clk_ipg));
>> > > -return PTR_ERR(imx->clk_ipg);
>> > > -}
>> > > -
>> > >  imx->chip.ops = &imx_pwm_ops;
>> > >  imx->chip.dev = &pdev->dev;
>> > >  imx->chip.base = -1;
>>


Re: [PATCH] Revert "mmc: dw_mmc-rockchip: add runtime PM support"

2016-12-28 Thread Shawn Lin

On 2016/12/29 15:13, Jaehoon Chung wrote:

On 12/29/2016 12:02 PM, Jaehoon Chung wrote:

Hi Randy,

On 12/29/2016 12:34 AM, Randy Li wrote:

This reverts commit f90142683f04bcb0729bf0df67a5e29562b725b9.
It is reported to make RK3288 unable to boot from eMMC/MMC.


Could you explain in more detail?
As you mentioned, this patch keeps RK3288 from booting... but why?
The right way is to find the root cause and fix it,
not just revert.


To Shawn,

Could you check this? If you have rk3288..
If it's not working, we need to revert this patch until the
problem is found.



Hrmm... that patchset was tested on rk3288 and rk3368, so I
need to know which board Randy is using now. Could you share some
log?

I will have a look at it.



Best Regards,
Jaehoon Chung



Best Regards,
Jaehoon Chung



Signed-off-by: Randy Li 
---
 drivers/mmc/host/dw_mmc-rockchip.c | 41 +++---
 1 file changed, 3 insertions(+), 38 deletions(-)

diff --git a/drivers/mmc/host/dw_mmc-rockchip.c 
b/drivers/mmc/host/dw_mmc-rockchip.c
index 9a46e46..3189234 100644
--- a/drivers/mmc/host/dw_mmc-rockchip.c
+++ b/drivers/mmc/host/dw_mmc-rockchip.c
@@ -14,7 +14,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 

 #include "dw_mmc.h"
@@ -327,7 +326,6 @@ static int dw_mci_rockchip_probe(struct platform_device 
*pdev)
 {
const struct dw_mci_drv_data *drv_data;
const struct of_device_id *match;
-   int ret;

if (!pdev->dev.of_node)
return -ENODEV;
@@ -335,49 +333,16 @@ static int dw_mci_rockchip_probe(struct platform_device 
*pdev)
match = of_match_node(dw_mci_rockchip_match, pdev->dev.of_node);
drv_data = match->data;

-   pm_runtime_get_noresume(&pdev->dev);
-   pm_runtime_set_active(&pdev->dev);
-   pm_runtime_enable(&pdev->dev);
-   pm_runtime_set_autosuspend_delay(&pdev->dev, 50);
-   pm_runtime_use_autosuspend(&pdev->dev);
-
-   ret = dw_mci_pltfm_register(pdev, drv_data);
-   if (ret) {
-   pm_runtime_disable(&pdev->dev);
-   pm_runtime_set_suspended(&pdev->dev);
-   pm_runtime_put_noidle(&pdev->dev);
-   return ret;
-   }
-
-   pm_runtime_put_autosuspend(&pdev->dev);
-
-   return 0;
+   return dw_mci_pltfm_register(pdev, drv_data);
 }

-static int dw_mci_rockchip_remove(struct platform_device *pdev)
-{
-   pm_runtime_get_sync(&pdev->dev);
-   pm_runtime_disable(&pdev->dev);
-   pm_runtime_put_noidle(&pdev->dev);
-
-   return dw_mci_pltfm_remove(pdev);
-}
-
-static const struct dev_pm_ops dw_mci_rockchip_dev_pm_ops = {
-   SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
-   pm_runtime_force_resume)
-   SET_RUNTIME_PM_OPS(dw_mci_runtime_suspend,
-  dw_mci_runtime_resume,
-  NULL)
-};
-
 static struct platform_driver dw_mci_rockchip_pltfm_driver = {
.probe  = dw_mci_rockchip_probe,
-   .remove = dw_mci_rockchip_remove,
+   .remove = dw_mci_pltfm_remove,
.driver = {
.name   = "dwmmc_rockchip",
.of_match_table = dw_mci_rockchip_match,
-   .pm = &dw_mci_rockchip_dev_pm_ops,
+   .pm = &dw_mci_pltfm_pmops,
},
 };




--
Best Regards
Shawn Lin



Re: [PATCH V1] pinctrl:pxa:pinctrl-pxa2xx:- No need of devm functions

2016-12-28 Thread Robert Jarzmik
Linus Walleij  writes:

> On Thu, Dec 8, 2016 at 3:35 PM, Arvind Yadav  
> wrote:
>
>> In functions pxa2xx_build_functions, the memory allocated for
>> 'functions' is live within the function only. After the
>> allocation it is immediately freed with devm_kfree. There is
>> no need to allocate memory for 'functions' with devm function
>> so replace devm_kcalloc  with kcalloc and devm_kfree with kfree.
>>
>> Signed-off-by: Arvind Yadav 
>
> I want the maintainer Robert Jarzmik to review this before I do anything

Hi Linus,

I did review, on December the 10th. I wasn't very enthusiastic about the patch,
if you check back my reply.

Cheers.

-- 
Robert


Re: [PATCH v3 2/4] phy: qcom-qusb2: New driver for QUSB2 PHY on Qcom chips

2016-12-28 Thread Vivek Gautam
Hi Stephen,

On Thu, Dec 29, 2016 at 12:27 PM, Vivek Gautam
 wrote:
> On Thu, Dec 29, 2016 at 4:31 AM, Stephen Boyd  wrote:
>> On 12/20, Vivek Gautam wrote:
>>> PHY transceiver driver for QUSB2 phy controller that provides
>>> HighSpeed functionality for DWC3 controller present on
>>> Qualcomm chipsets.
>>>
>>> Signed-off-by: Vivek Gautam 
>>
>> One comment below, but otherwise
>>
>> Reviewed-by: Stephen Boyd 

Thanks for the review.


Best Regards
Vivek

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: mm: fix typo of cache_alloc_zspage()

2016-12-28 Thread Minchan Kim
Hi Sergey,

On Thu, Dec 29, 2016 at 03:52:05PM +0900, Sergey Senozhatsky wrote:
> On (12/29/16 15:44), Minchan Kim wrote:
> > On Thu, Dec 29, 2016 at 10:06:47AM +0800, Xishi Qiu wrote:
> > > Signed-off-by: Xishi Qiu 
> > > ---
> > >  mm/zsmalloc.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > > index 9cc3c0b..2d6c92e 100644
> > > --- a/mm/zsmalloc.c
> > > +++ b/mm/zsmalloc.c
> > > @@ -364,7 +364,7 @@ static struct zspage *cache_alloc_zspage(struct 
> > > zs_pool *pool, gfp_t flags)
> > >  {
> > >   return kmem_cache_alloc(pool->zspage_cachep,
> > >   flags & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
> > > -};
> > > +}
> > 
> > Although it's trivial, we need a description.
> > Please, could you resend to Andrew Morton with the description filled in?
> 
> I don't know... do we want to have it as a separate patch?
> may be we can fold it into some other patch someday later.

Xishi spent his time to make the patch(review,create/send). And I want to
give a credit to him. :)

Thanks.


Re: [PATCH v3 2/4] phy: qcom-qusb2: New driver for QUSB2 PHY on Qcom chips

2016-12-28 Thread Vivek Gautam
On Thu, Dec 29, 2016 at 4:31 AM, Stephen Boyd  wrote:
> On 12/20, Vivek Gautam wrote:
>> PHY transceiver driver for QUSB2 phy controller that provides
>> HighSpeed functionality for DWC3 controller present on
>> Qualcomm chipsets.
>>
>> Signed-off-by: Vivek Gautam 
>
> One comment below, but otherwise
>
> Reviewed-by: Stephen Boyd 
>
>> +static void qusb2_phy_set_tune2_param(struct qusb2_phy *qphy)
>> +{
>> + struct device *dev = &qphy->phy->dev;
>> + u8 *val;
>> +
>> + /*
>> +  * Read efuse register having TUNE2 parameter's high nibble.
>> +  * If efuse register shows value as 0x0, or if we fail to find
>> +  * a valid efuse register settings, then use default value
>> +  * as 0xB for high nibble that we have already set while
>> +  * configuring phy.
>> +  */
>> + val = nvmem_cell_read(qphy->cell, NULL);
>> + if (IS_ERR(val) || !val[0]) {
>> + dev_dbg(dev, "failed to read a valid hs-tx trim value, %ld\n",
>> + PTR_ERR(val));
>
> If val is 0 PTR_ERR(0) will be junk? I guess that's ok for debug
> print.

Maybe -EINVAL is better for the debug print. Even when val[0]
is 0, val will still be a valid pointer, and so PTR_ERR(val) will
essentially be the pointer cast to long.



Thanks
Vivek
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
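
The PTR_ERR() point from the reply above can be checked with a small userspace re-creation of the error-pointer convention. The model_* helpers and trim_status() are written for this example only; they mimic, but are not, the kernel's ERR_PTR()/IS_ERR()/PTR_ERR().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Userspace re-creation of the kernel's error-pointer convention. */
#define MODEL_MAX_ERRNO 4095

static inline void *model_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static inline long model_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Error pointers occupy the top MODEL_MAX_ERRNO addresses. */
static inline int model_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/*
 * Classify a trim read: propagate the real error for an error pointer,
 * report -EINVAL when the read worked but the value is zero.
 */
static long trim_status(const uint8_t *val)
{
    if (model_is_err(val))
        return model_ptr_err(val);
    if (!val[0])
        return -EINVAL;
    return 0;
}
```

Splitting the two failure cases means the debug message prints a real errno in both, instead of a pointer value when the read succeeded but returned zero.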


[RFC PATCH V2] ext4: increase the protection of drop nlink and ext4 inode destroy

2016-12-28 Thread yi zhang
Because of disk and hardware issues, an ext4 filesystem can accumulate
many errors. If inode->i_nlink abnormally becomes zero while the dentry
is still positive, memory corruption follows from this sequence:

 1) Because inode->i_nlink is 0, the inode is added to the orphan list,
 2) ext4_rename() renames over this inode, and drop_nlink() wraps
inode->i_nlink around to 0x,
 3) iput() adds the inode to the LRU,
 4) evict() calls destroy_inode() to destroy the inode but skips
removing it from the orphan list,
 5) after this, the inode's memory is reused by another module; when
ext4 later changes the orphan list, it tramples that module's data
and may cause an oops.

Although we cannot avoid hardware and disk errors, we can contain the
software error within the ext4 module, so that it does not affect other
modules and make problems harder to locate.

This patch prevents inode->i_nlink from wrapping around and, when an
inode is destroyed, removes it from the orphan list if the list is not
empty.

changes since: v1
 - correct a spelling mistake.
 - change the style of the WARN string.

Signed-off-by: yi zhang 
---
 fs/ext4/super.c | 1 +
 fs/inode.c      | 5 ++++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 52b0530..617327e 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -975,6 +975,7 @@ static void ext4_destroy_inode(struct inode *inode)
 				EXT4_I(inode), sizeof(struct ext4_inode_info),
 				true);
 		dump_stack();
+		ext4_orphan_del(NULL, inode);
 	}
 	call_rcu(&inode->i_rcu, ext4_i_callback);
 }
diff --git a/fs/inode.c b/fs/inode.c
index 88110fd..079d383 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -279,7 +279,10 @@ static void destroy_inode(struct inode *inode)
  */
 void drop_nlink(struct inode *inode)
 {
-	WARN_ON(inode->i_nlink == 0);
+	if (WARN(inode->i_nlink == 0,
+		 "inode %lu nlink is already 0", inode->i_ino))
+		return;
+
 	inode->__i_nlink--;
 	if (!inode->i_nlink)
 		atomic_long_inc(&inode->i_sb->s_remove_count);
-- 
2.5.0
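
The effect of the drop_nlink() change can be sketched in userspace as below.
This is only a model under stated assumptions: struct fake_inode and
guarded_drop_nlink() are hypothetical names for this example, and fprintf()
stands in for the kernel's WARN(). It shows that an unsigned link count is
left at 0 instead of wrapping when the decrement is refused.

```c
#include <stdio.h>

/* Hypothetical, cut-down stand-in for struct inode. */
struct fake_inode {
	unsigned long ino;
	unsigned int nlink;	/* unsigned, so 0 - 1 would wrap to UINT_MAX */
};

/* Returns 1 if the link count was dropped, 0 if the call was refused. */
int guarded_drop_nlink(struct fake_inode *inode)
{
	if (inode->nlink == 0) {
		/* Stand-in for WARN(): report and bail out, do not decrement. */
		fprintf(stderr, "inode %lu nlink is already 0\n", inode->ino);
		return 0;
	}
	inode->nlink--;
	return 1;
}
```

The original WARN_ON() only reported the condition and then decremented
anyway; the early return is what keeps the count from wrapping.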



Re: mm: fix typo of cache_alloc_zspage()

2016-12-28 Thread Sergey Senozhatsky
On (12/29/16 15:44), Minchan Kim wrote:
> On Thu, Dec 29, 2016 at 10:06:47AM +0800, Xishi Qiu wrote:
> > Signed-off-by: Xishi Qiu 
> > ---
> >  mm/zsmalloc.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index 9cc3c0b..2d6c92e 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -364,7 +364,7 @@ static struct zspage *cache_alloc_zspage(struct zs_pool 
> > *pool, gfp_t flags)
> >  {
> > return kmem_cache_alloc(pool->zspage_cachep,
> > flags & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
> > -};
> > +}
> 
> Although it's trivial, we need a description.
> Could you please resend it to Andrew Morton with the description filled in?

I don't know... do we want to have it as a separate patch?
maybe we can fold it into some other patch someday later.

-ss


Re: mm: fix typo of cache_alloc_zspage()

2016-12-28 Thread Minchan Kim
On Thu, Dec 29, 2016 at 10:06:47AM +0800, Xishi Qiu wrote:
> Signed-off-by: Xishi Qiu 
> ---
>  mm/zsmalloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 9cc3c0b..2d6c92e 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -364,7 +364,7 @@ static struct zspage *cache_alloc_zspage(struct zs_pool 
> *pool, gfp_t flags)
>  {
>   return kmem_cache_alloc(pool->zspage_cachep,
>   flags & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
> -};
> +}

Although it's trivial, we need a description.
Could you please resend it to Andrew Morton with the description filled in?

Andrew Morton 

Thanks.


Re: [PATCH 4/7] mm, vmscan: show LRU name in mm_vmscan_lru_isolate tracepoint

2016-12-28 Thread Minchan Kim
On Wed, Dec 28, 2016 at 04:30:29PM +0100, Michal Hocko wrote:
> From: Michal Hocko 
> 
> mm_vmscan_lru_isolate currently prints only whether the LRU we isolate
> from is file or anonymous but we do not know which LRU this is. It is
> useful to know whether the list is active or inactive as well. Change
> the tracepoint to show symbolic names of the LRU instead.
> 
> Signed-off-by: Michal Hocko 

This is not exactly the same, but the idea is almost the same.
I used almost the same tracepoint to investigate an aging (i.e., deactivating)
problem on a 32-bit kernel with node-lru.
It was enough. Namely, I didn't need a tracepoint in shrink_active_list like the
one in your first patch.
Your first patch is more straightforward and informative. But since you
introduced this patch, I want to ask here:
isn't this patch enough, without your first one, to find such a problem?

Thanks.

> ---
>  include/trace/events/vmscan.h | 20 ++--
>  mm/vmscan.c   |  2 +-
>  2 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index 6af4dae46db2..cc0b4c456c78 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -36,6 +36,14 @@
>   (RECLAIM_WB_ASYNC) \
>   )
>  
> +#define show_lru_name(lru) \
> + __print_symbolic(lru, \
> + {LRU_INACTIVE_ANON, "LRU_INACTIVE_ANON"}, \
> + {LRU_ACTIVE_ANON, "LRU_ACTIVE_ANON"}, \
> + {LRU_INACTIVE_FILE, "LRU_INACTIVE_FILE"}, \
> + {LRU_ACTIVE_FILE, "LRU_ACTIVE_FILE"}, \
> + {LRU_UNEVICTABLE, "LRU_UNEVICTABLE"})
> +
>  TRACE_EVENT(mm_vmscan_kswapd_sleep,
>  
>   TP_PROTO(int nid),
> @@ -277,9 +285,9 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
>   unsigned long nr_skipped,
>   unsigned long nr_taken,
>   isolate_mode_t isolate_mode,
> - int file),
> + int lru),
>  
> - TP_ARGS(classzone_idx, order, nr_requested, nr_scanned, nr_skipped, 
> nr_taken, isolate_mode, file),
> + TP_ARGS(classzone_idx, order, nr_requested, nr_scanned, nr_skipped, 
> nr_taken, isolate_mode, lru),
>  
>   TP_STRUCT__entry(
>   __field(int, classzone_idx)
> @@ -289,7 +297,7 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
>   __field(unsigned long, nr_skipped)
>   __field(unsigned long, nr_taken)
>   __field(isolate_mode_t, isolate_mode)
> - __field(int, file)
> + __field(int, lru)
>   ),
>  
>   TP_fast_assign(
> @@ -300,10 +308,10 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
>   __entry->nr_skipped = nr_skipped;
>   __entry->nr_taken = nr_taken;
>   __entry->isolate_mode = isolate_mode;
> - __entry->file = file;
> + __entry->lru = lru;
>   ),
>  
> - TP_printk("isolate_mode=%d classzone=%d order=%d nr_requested=%lu 
> nr_scanned=%lu nr_skipped=%lu nr_taken=%lu file=%d",
> + TP_printk("isolate_mode=%d classzone=%d order=%d nr_requested=%lu 
> nr_scanned=%lu nr_skipped=%lu nr_taken=%lu lru=%s",
>   __entry->isolate_mode,
>   __entry->classzone_idx,
>   __entry->order,
> @@ -311,7 +319,7 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
>   __entry->nr_scanned,
>   __entry->nr_skipped,
>   __entry->nr_taken,
> - __entry->file)
> + show_lru_name(__entry->lru))
>  );
>  
>  TRACE_EVENT(mm_vmscan_writepage,
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 4f7c0d66d629..3f0774f30a42 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1500,7 +1500,7 @@ static unsigned long isolate_lru_pages(unsigned long 
> nr_to_scan,
>   }
>   *nr_scanned = scan + total_skipped;
>   trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, nr_to_scan, 
> scan,
> - skipped, nr_taken, mode, is_file_lru(lru));
> + skipped, nr_taken, mode, lru);
>   update_lru_sizes(lruvec, lru, nr_zone_taken, nr_taken);
>   return nr_taken;
>  }
> -- 
> 2.10.2
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 
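
As a rough userspace illustration of what the quoted patch changes, the sketch
below maps an LRU index to its symbolic name and contrasts that with the old
file/anon flag. The enum is a local copy assumed to mirror the kernel's
enum lru_list, lru_name() is a hypothetical analogue of show_lru_name(), and
the "UNKNOWN" fallback is an assumption of this example, not the behaviour of
__print_symbolic().

```c
#include <stddef.h>

/* Local copy of the list indices; assumed to mirror enum lru_list. */
enum lru_list {
	LRU_INACTIVE_ANON,
	LRU_ACTIVE_ANON,
	LRU_INACTIVE_FILE,
	LRU_ACTIVE_FILE,
	LRU_UNEVICTABLE,
};

struct sym {
	int value;
	const char *name;
};

static const struct sym lru_names[] = {
	{ LRU_INACTIVE_ANON, "LRU_INACTIVE_ANON" },
	{ LRU_ACTIVE_ANON,   "LRU_ACTIVE_ANON" },
	{ LRU_INACTIVE_FILE, "LRU_INACTIVE_FILE" },
	{ LRU_ACTIVE_FILE,   "LRU_ACTIVE_FILE" },
	{ LRU_UNEVICTABLE,   "LRU_UNEVICTABLE" },
};

/* Hypothetical userspace analogue of show_lru_name(). */
const char *lru_name(int lru)
{
	size_t i;

	for (i = 0; i < sizeof(lru_names) / sizeof(lru_names[0]); i++)
		if (lru_names[i].value == lru)
			return lru_names[i].name;
	return "UNKNOWN";	/* fallback chosen for this sketch only */
}

/* What the old tracepoint reported: only file vs. anon. */
int is_file_lru(int lru)
{
	return lru == LRU_INACTIVE_FILE || lru == LRU_ACTIVE_FILE;
}
```

With only the file flag, the active and inactive file lists are
indistinguishable in the trace output; the symbolic name separates them.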


Re: [PATCH 3/7] mm, vmscan: show the number of skipped pages in mm_vmscan_lru_isolate

2016-12-28 Thread Minchan Kim
On Wed, Dec 28, 2016 at 04:30:28PM +0100, Michal Hocko wrote:
> From: Michal Hocko 
> 
> mm_vmscan_lru_isolate shows the number of requested, scanned and taken
> pages. This is mostly OK but on 32b systems the number of scanned pages
> is quite misleading because it includes both the scanned and skipped
> pages.  Moreover the skipped part is scaled based on the number of taken
> pages. Let's report the exact numbers without any additional logic and
> add the number of skipped pages. This should make the reported data much
> easier to interpret.
> 
> Signed-off-by: Michal Hocko 
Acked-by: Minchan Kim 



Re: [PATCH] pci: rename *host* directory to *controller*

2016-12-28 Thread Kishon Vijay Abraham I
Hi,

On Wednesday 28 December 2016 10:50 PM, Joao Pinto wrote:
> On 12/28/2016 at 5:17 PM, Joao Pinto wrote:
>> On 12/28/2016 at 4:41 PM, Bjorn Helgaas wrote:
>>> On Wed, Dec 28, 2016 at 01:57:13PM +, Joao Pinto wrote:
>>>> On 12/28/2016 at 9:22 AM, Christoph Hellwig wrote:
>>>>> On Wed, Dec 28, 2016 at 01:39:37PM +0530, Kishon Vijay Abraham I wrote:
>>>>>> As discussed during our LPC discussions, I'm posting the rename patch
>>>>>> here. I'll post the rest of EP series before the next merge window.
>>>>>>
>>>>>> There might be hiccups because of this renaming but feel this is
>>>>>> necessary for long-term maintenance.
>>>>>
>>>>> if we do this rename it would be great to get it to Linus NOW after
>>>>> -rc1 as that minimizes the impact on the 4.11 merge window.
>>>>
>>>> Renaming it to controller is a bit vague, I think, since we have the PCI
>>>> Endpoint IP also. Wouldn't it be better to name it rc_controller?
>>>
>>> I think Kishon's whole goal is to add PCI Endpoint IP, so he wants a
>>> neutral name that can cover both RC and Endpoint.

right.
>>>
>>> I'm not a huge fan of "controller" because it feels a little bit long
>>> and while I suppose it technically does include the concept of the PCI
>>> interface of an endpoint, it still suggests more of the host side to
>>> me.
>>>
>>> Doesn't USB have a similar situation?  I see there's a
>>> drivers/usb/host/ (probably where we copied from in the first place).
>>> Is a USB gadget the USB analog of what you're doing?  How do they
>>> share code between the master/slave sides?
>>>
>>
>> usb/host contains the spec-defined implementations of the several
>> *hci (USB Host) controllers, and then you can have, for example, the USB 3.0
>> DesignWare host-specific ops in dwc3/ and the USB 2.0 ones in dwc2/.

Right, each IP has a separate directory in USB. I thought of doing something
similar for PCI but decided against it, since that would involve identifying all
the PCI IPs in use and would eventually result in more directories.
>> For device purposes it uses the core/ and then some of the device functions
>> are extended from the gadget/ package in which you can use mass_storage and
>> other types of functions.

That would be similar for PCI endpoint. All endpoint specific core
functionality will be added in drivers/pci/endpoint (see RFC [1]).
>>
>> In our case in PCI we have the core functions inside /drivers/pci and the
>> host mangled inside host. I suggest:
>>
>> drivers/pci
>>  drivers/pci/core/
>>  drivers/pci/core/hotplug
>>  drivers/pci/core/pcie
>>  drivers/pci/core/
>>  drivers/pci/host
>>  drivers/pci/dwc -> here would be pcie-designware and the specific vendor 
>> drivers
> 
> Correction:
> drivers/pci/host/dwc -> here would be pcie-designware and the specific vendor
> drivers
> 
>>  drivers/pci/ -> here would be the drivers for vendorN controller
> 
> Correction:
> drivers/pci/host/ -> here would be the drivers for vendorN controller
> 
>>  drivers/pci/endpoint -> common code
>>  drivers/pci/endpoint/dwc -> implementation of Synopsys specific endpoint ops
>>  drivers/pci/ -> implementation of other vendors specific endpoint 
>> ops

There are some parts of the dwc driver that are common to both root complex and
endpoint. Where should those go? I'm sure no one wants to duplicate the common
pieces in both root complex and endpoint.

[1] -> https://lkml.org/lkml/2016/9/14/27

Thanks
Kishon


Re: [PATCH v6] soc: qcom: Add SoC info driver

2016-12-28 Thread Imran Khan
On 12/29/2016 4:05 AM, Stephen Boyd wrote:
> On 12/23, Imran Khan wrote:
>> On 12/22/2016 6:01 AM, Stephen Boyd wrote:
>>>
>>> Raw numbers sounds fine, but how do we know what ODM it is to
>>> understand how to parse the numbers appropriately? Perhaps the
>>> smem DT entry needs to have a property indicating the ODM that
>>> has configured these numbers, and then we can have an ODM sysfs
>>> node that we use to expose that string property to userspace?
>>>
>> Okay smem DT entry can be used to provide ODM information but even after 
>> having this feature, I am not sure if we can provide a code in the driver
>> that will act for all ODMs because we don't know how other ODMs will 
>> interpret
>> platform types and subtypes numbers.
>> Or do you mean here that we should keep string values corresponding to 
>> different
>> platform type and subtype numbers in the smem DT entry itself. We will use
>> socinfo from smem to get the raw number and then translate that raw number 
>> to 
>> a string, using the mapping given in DT itself.
>>
> 
> I mean in DT
> 
>   smem {
>   compatible = "qcom,smem";
>   qcom,odm = "odm_name";
>   }
> 
> And then in the driver code we look for the qcom,odm property and
> make a sysfs attribute called odm or something that exposes the
> string "odm_name" to userspace. Then we have some userspace
> database of odm string and platform type/subtype numbers that we
> can use to figure out what those numbers mean.
> 
Okay. This approach is fine for me. In the meantime I had tried
one alternative approach which I wanted to share. In the smem DT
entry we have something like:

 smem {
	compatible = "qcom,smem";
	..
	..
	smem,plat-type = "Unknown", "Surf", "FFA", "Fluid",
			 "SVLTE_FFA", "SVLTE_SURF", "Unknown",
			 "MDM_MTP_NO_DISPLAY", "MTP", "Liquid",
			 "Dragon", "QRD", "Unknown", "HRD", "DTV";
	smem,qrd-plat-subtype = "QRD", "SKUAA", "SKUF", "SKUAB",
				"SKUG";
	smem,plat-subtype = "Unknown", "charm", "strange",
			    "strange_2a";
 };

And back in qcom_soc_init, we read these lists using the value (index)
obtained from smem:

	if (socinfo->v0_1.format >= 6) {
		/* Get platform type and subtype here */
		type = socinfo->v0_6.hw_platform_subtype;
		if (type >= 0 && plat_type) {
			if (socinfo->v0_3.hw_platform == HW_PLATFORM_QRD) {
				type_max = of_property_count_strings(
						device->of_node,
						"smem,qrd-plat-subtype");
				if (type < type_max) {
					of_property_read_string_index(
						device->of_node,
						"smem,qrd-plat-subtype",
						type, &plat_type->sub_type);
				}
			} else {
				type_max = of_property_count_strings(
						device->of_node,
						"smem,plat-subtype");
				if (type < type_max) {
					of_property_read_string_index(
						device->of_node,
						"smem,plat-subtype",
						type, &plat_type->sub_type);
				}
			}
		}
	}

	if (socinfo->v0_1.format >= 3) {
		/* Get only platform type */
		type = socinfo->v0_3.hw_platform;
		type_max = of_property_count_strings(
				device->of_node, "smem,plat-type");
		if ((type >= 0 && type < type_max) && plat_type) {
			of_property_read_string_index(device->of_node,
					"smem,plat-type", type,
					&plat_type->type);
		}
	}


Could you please also provide your feedback about this approach? I just wanted
to share this before going ahead with the final implementation.
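
For reference, the index-to-name translation proposed above boils down to a
bounds-checked table lookup. The sketch below models that in plain C, with the
table standing in for the proposed "smem,plat-type" property; lookup_name()
and plat_type_name() are hypothetical names for this example and do not use
the OF API.

```c
#include <stddef.h>

/* Hypothetical table standing in for the proposed "smem,plat-type" list. */
static const char *const plat_type_names[] = {
	"Unknown", "Surf", "FFA", "Fluid",
	"SVLTE_FFA", "SVLTE_SURF", "Unknown",
	"MDM_MTP_NO_DISPLAY", "MTP", "Liquid",
	"Dragon", "QRD", "Unknown", "HRD", "DTV",
};

/* Returns the name at index, or NULL when index is past the end. */
const char *lookup_name(const char *const *names, size_t count,
			unsigned int index)
{
	return index < count ? names[index] : NULL;
}

/* Translate a raw platform type number into its string, if known. */
const char *plat_type_name(unsigned int hw_platform)
{
	return lookup_name(plat_type_names,
			   sizeof(plat_type_names) / sizeof(plat_type_names[0]),
			   hw_platform);
}
```

Taking the index as unsigned makes a separate "type >= 0" test unnecessary;
whether the driver's type variable is signed is not visible from the snippet
above, so that part is an assumption.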

Regards,
Imran


-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of
the Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint

2016-12-28 Thread Minchan Kim
On Wed, Dec 28, 2016 at 04:30:27PM +0100, Michal Hocko wrote:
> From: Michal Hocko 
> 
> Our reclaim process has several tracepoints to tell us more about how
> things are progressing. We are, however, missing a tracepoint to track
> active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> the number of scanned, rotated, deactivated and freed pages from the
> particular node's active list.
> 
> Signed-off-by: Michal Hocko 
> ---
>  include/linux/gfp.h   |  2 +-
>  include/trace/events/vmscan.h | 38 ++
>  mm/page_alloc.c   |  6 +-
>  mm/vmscan.c   | 22 +-
>  4 files changed, 61 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 4175dca4ac39..61aa9b49e86d 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -503,7 +503,7 @@ void * __meminit alloc_pages_exact_nid(int nid, size_t 
> size, gfp_t gfp_mask);
>  extern void __free_pages(struct page *page, unsigned int order);
>  extern void free_pages(unsigned long addr, unsigned int order);
>  extern void free_hot_cold_page(struct page *page, bool cold);
> -extern void free_hot_cold_page_list(struct list_head *list, bool cold);
> +extern int free_hot_cold_page_list(struct list_head *list, bool cold);
>  
>  struct page_frag_cache;
>  extern void __page_frag_drain(struct page *page, unsigned int order,
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index 39bad8921ca1..d34cc0ced2be 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -363,6 +363,44 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
>   show_reclaim_flags(__entry->reclaim_flags))
>  );
>  
> +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> +
> + TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
> + unsigned long nr_unevictable, unsigned long nr_deactivated,
> + unsigned long nr_rotated, int priority, int file),
> +
> + TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, 
> nr_rotated, priority, file),

I agree it is helpful. It was when I investigated the aging problem on 32-bit
when node-lru was introduced. However, the question is whether we really need
all of this information. Wouldn't nr_taken, nr_deactivated, priority and file
be enough?

Also, see the minor comment below.

Thanks.

> +
> + TP_STRUCT__entry(
> + __field(int, nid)
> + __field(unsigned long, nr_scanned)
> + __field(unsigned long, nr_freed)
> + __field(unsigned long, nr_unevictable)
> + __field(unsigned long, nr_deactivated)
> + __field(unsigned long, nr_rotated)
> + __field(int, priority)
> + __field(int, reclaim_flags)
> + ),
> +
> + TP_fast_assign(
> + __entry->nid = nid;
> + __entry->nr_scanned = nr_scanned;
> + __entry->nr_freed = nr_freed;
> + __entry->nr_unevictable = nr_unevictable;
> + __entry->nr_deactivated = nr_deactivated;
> + __entry->nr_rotated = nr_rotated;
> + __entry->priority = priority;
> + __entry->reclaim_flags = trace_shrink_flags(file);
> + ),
> +
> + TP_printk("nid=%d nr_scanned=%ld nr_freed=%ld nr_unevictable=%ld 
> nr_deactivated=%ld nr_rotated=%ld priority=%d flags=%s",
> + __entry->nid,
> + __entry->nr_scanned, __entry->nr_freed, __entry->nr_unevictable,
> + __entry->nr_deactivated, __entry->nr_rotated,
> + __entry->priority,
> + show_reclaim_flags(__entry->reclaim_flags))
> +);
> +
>  #endif /* _TRACE_VMSCAN_H */
>  
>  /* This part must be outside protection */
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1c24112308d6..77d204660857 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2487,14 +2487,18 @@ void free_hot_cold_page(struct page *page, bool cold)
>  /*
>   * Free a list of 0-order pages
>   */
> -void free_hot_cold_page_list(struct list_head *list, bool cold)
> +int free_hot_cold_page_list(struct list_head *list, bool cold)
>  {
>   struct page *page, *next;
> + int ret = 0;
>  
>   list_for_each_entry_safe(page, next, list, lru) {
>   trace_mm_page_free_batched(page, cold);
>   free_hot_cold_page(page, cold);
> + ret++;
>   }
> +
> + return ret;
>  }
>  
>  /*
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c4abf08861d2..2302a1a58c6e 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1846,9 +1846,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct 
> lruvec *lruvec,
>   *
>   * The downside is that we have to touch page->_refcount against each page.
>   * But we had to alter page->flags anyway.
> + *
> + * Returns the number of pages moved to the given lru.
>   */
>  
> -static void move_active_pages_to_lru(struct lruvec *lruvec,
> +static int 

Re: [PATCH] ib umem: bugfix: mixed put_pid()s in ib_umem_get()

2016-12-28 Thread Leon Romanovsky
On Thu, Dec 29, 2016 at 10:24:43AM +0800, Kenneth Lee wrote:
> There are two bugfixes in this patch:
>
> 1. When the execution go to the ib_umem_odp_get() path, pid should be put
>back.
> 2. When the memory allocation fail, the pid also should be put back before
>exit.
>
> Signed-off-by: Kenneth Lee 

Hi Kenneth,

Thank you for resubmitting it. This fix is important and there is no doubt
that it will be accepted; however, you need to improve the patch a little bit
more.

Can you please resubmit it according to Documentation/SubmittingPatches
and the reviewers' feedback?

Please add Haggai's Reviewed-by tag, a Fixes tag, a changelog, the version in
the title and a proper title (see git log for this subsystem).

Thanks



Re: [PATCH 2/2] mm: add PageWaiters indicating tasks are waiting for a page bit

2016-12-28 Thread Nicholas Piggin
On Wed, 28 Dec 2016 20:16:56 -0800
Linus Torvalds  wrote:

> On Wed, Dec 28, 2016 at 8:08 PM, Nicholas Piggin  wrote:
> >
> > Okay. The name could be a bit better though I think, for readability.
> > Just a BUILD_BUG_ON if it is not constant and correct bit numbers?  
> 
> I have a slightly edited patch - moved the comments around and added
> some new comments (about both the sign bit, but also about how the
> smp_mb() shouldn't be necessary even for the non-atomic fallback).

That's a good point -- they're in the same byte, so all architectures
will be able to avoid the extra barrier regardless of how the
primitives are implemented. Good.

> 
> I also did a BUILD_BUG_ON(), except the other way around - keeping it
> about the sign bit in the byte, just verifying that yes,
> PG_waiters is that sign bit.

Yep. I still don't like the name, but now you've got PG_waiters
commented there at least. I'll have to live with it.

If we get more cases that want to use a similar function, we might make
a more general primitive for architectures that can optimize these multi
bit ops better than x86. Actually even x86 would prefer to do load ;
cmpxchg rather than bitop ; load for the cases where condition code can't
be used to check result.
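
The "load ; cmpxchg" shape mentioned above can be sketched in portable C11.
This is only a userspace illustration under stated assumptions: the function
name is hypothetical, the flags live in a single `_Atomic unsigned char`, and
bit 7 of that byte plays the role of the "waiters" sign bit. It is not the
kernel primitive.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical sketch: clear bit "nr" in one byte with release ordering
 * using a load followed by a compare-exchange loop, then report whether
 * the sign bit (bit 7) of the resulting byte is set.
 */
static bool sketch_clear_bit_is_negative_cmpxchg(unsigned int nr,
						 _Atomic unsigned char *byte)
{
	unsigned char old = atomic_load_explicit(byte, memory_order_relaxed);
	unsigned char next;

	do {
		/* Value we want to install: same byte with bit "nr" cleared. */
		next = old & (unsigned char)~(1u << nr);
	} while (!atomic_compare_exchange_weak_explicit(byte, &old, next,
							memory_order_release,
							memory_order_relaxed));

	/* "next" is the value that was stored, so no second load is needed. */
	return (next & 0x80) != 0;
}
```

The point of this shape is that the stored value is already in a register
when the loop exits, so the result check does not need a second load.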

> 
> > BTW. I just notice in your patch too that you didn't use "nr" in the
> > generic version.  
> 
> And I fixed that too.
> 
> Of course, I didn't test the changes (apart from building it). But
> I've been running the previous version since yesterday, so far no
> issues.

It looks good to me.

Thanks,
Nick



Re: [PATCH v3 3/4] dt-bindings: phy: Add support for QMP phy

2016-12-28 Thread Vivek Gautam
On Thu, Dec 29, 2016 at 4:34 AM, Stephen Boyd  wrote:
> On 12/20, Vivek Gautam wrote:
>> +
>> +Example:
>> + pcie_phy: phy@34000 {
>> + compatible = "qcom,msm8996-qmp-pcie-phy";
>> + reg = <0x034000 0x48f>,
>> + <0x035000 0x5bf>,
>> + <0x036000 0x5bf>,
>> + <0x037000 0x5bf>;
>> + /* tx, rx, pcs */
>> + lane-offsets = <0x0 0x200 0x400>;
>> + #phy-cells = <1>;
>> +
>> + clocks = <&gcc GCC_PCIE_PHY_AUX_CLK>,
>> + <&gcc GCC_PCIE_PHY_CFG_AHB_CLK>,
>> + <&gcc GCC_PCIE_CLKREF_CLK>,
>> + <&gcc GCC_PCIE_0_PIPE_CLK>,
>> + <&gcc GCC_PCIE_1_PIPE_CLK>,
>> + <&gcc GCC_PCIE_2_PIPE_CLK>;
>> + clock-names = "aux", "cfg_ahb", "ref",
>> + "pipe0", "pipe1", "pipe2";
>
> Can we add a #clock-cells = <0> or <1> here given that this is a
> clk provider? We may want to express the clk circular dependency
> between this phy node and GCC via the clocks property at some
> point instead of doing it implicitly via strings in C code.

Sure, will add #clock-cells = <1>.
Although phys like USB and PIPE currently have just the pipe_clk
being controlled by gcc, the UFS phy has tx/rx symbol clocks that
are controlled by gcc but are generated by phy the same way as
pipe_clk.
So, I guess #clock-cells = <1> makes sense.


Thanks
Vivek
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


[PATCH] mtd: nand: Update dependency of IFC for LS1021A

2016-12-28 Thread Alison Wang
As NAND support for Freescale/NXP IFC controller is available on
LS1021A, the dependency for LS1021A is added.

Signed-off-by: Alison Wang 
---
 drivers/mtd/nand/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mtd/nand/Kconfig b/drivers/mtd/nand/Kconfig
index 353a9dd..85e3860 100644
--- a/drivers/mtd/nand/Kconfig
+++ b/drivers/mtd/nand/Kconfig
@@ -441,7 +441,7 @@ config MTD_NAND_FSL_ELBC
 
 config MTD_NAND_FSL_IFC
tristate "NAND support for Freescale IFC controller"
-   depends on FSL_SOC || ARCH_LAYERSCAPE
+   depends on FSL_SOC || ARCH_LAYERSCAPE || SOC_LS1021A
select FSL_IFC
select MEMORY
help
-- 
2.1.0.27.g96db324



Re: [PATCH] net: fix incorrect original ingress device index in PKTINFO

2016-12-28 Thread David Ahern
On 12/27/16 12:03 PM, David Miller wrote:
> From: Wei Zhang 
> Date: Tue, 27 Dec 2016 17:52:24 +0800
> 
>> When we send a packet for our own local address on a non-loopback
>> interface (e.g. eth0), due to the change had been introduced from
>> commit 0b922b7a829c ("net: original ingress device index in PKTINFO"), the
>> original ingress device index would be set as the loopback interface.
>> However, the packet should be considered as if it is being arrived via the
>> sending interface (eth0), otherwise it would break the expectation of the
>> userspace application (e.g. the DHCPRELEASE message from dhcp_release
>> binary would be ignored by the dnsmasq daemon, since it come from lo which
>> is not the interface dnsmasq bind to)
>>

Add a Fixes line before the sign-off:

Fixes: 0b922b7a829c ("net: original ingress device index in PKTINFO")
>> Signed-off-by: Wei Zhang 
> 
> When you are fixing a problem introduced by another change, always CC:
> the author of that change as I have done so here.
> 
> David, please take a look at this, thanks.
> 
>> ---
>>  net/ipv4/ip_sockglue.c | 8 +++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
>> index b8a2d63..76d78a7 100644
>> --- a/net/ipv4/ip_sockglue.c
>> +++ b/net/ipv4/ip_sockglue.c
>> @@ -1202,8 +1202,14 @@ void ipv4_pktinfo_prepare(const struct sock *sk, 
>> struct sk_buff *skb)
>>   * which has interface index (iif) as the first member of the
>>   * underlying inet{6}_skb_parm struct. This code then overlays
>>   * PKTINFO_SKB_CB and in_pktinfo also has iif as the first
>> - * element so the iif is picked up from the prior IPCB
>> + * element so the iif is picked up from the prior IPCB except
>> + * iif is loopback interface which the packet should be
>> + * considered as if it is being arrived via the sending
>> + * interface

That comment change could use an adjustment (adjusted to fit within 80
columns):

element so the iif is picked up from the prior IPCB. If iif
is the loopback interface, then return the sending interface
(e.g., process binds socket to eth0 for Tx which is redirected
to loopback in the rtable/dst).


>>   */
>> +if (pktinfo->ipi_ifindex == LOOPBACK_IFINDEX)
>> +pktinfo->ipi_ifindex = inet_iif(skb);
>> +
>>  pktinfo->ipi_spec_dst.s_addr = fib_compute_spec_dst(skb);
>>  } else {
>>  pktinfo->ipi_ifindex = 0;


The actual change looks ok to me.

Acked-by: David Ahern 
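
The selection rule in the hunk above can be restated as a small standalone
function. This is a sketch only: the function and macro names are
hypothetical, and the loopback index value of 1 is an assumption made for the
example rather than something taken from the patch.

```c
/* Assumption for this sketch: the loopback device has interface index 1. */
#define SKETCH_LOOPBACK_IFINDEX 1

/*
 * Hypothetical restatement of the rule: report the index stored in the
 * control block unless it names loopback, in which case report the
 * interface the packet was sent on instead.
 */
static int sketch_pktinfo_ifindex(int cb_iif, int sending_iif)
{
	if (cb_iif == SKETCH_LOOPBACK_IFINDEX)
		return sending_iif;
	return cb_iif;
}
```

With this rule, a listener bound to eth0 sees eth0's index for
locally generated traffic instead of the loopback index.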


[PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2016-12-28 Thread Jason Baron
On ata passthru commands scsih_qcmd() ends up spinning in
scsi_wait_for_queuecommand() indefinitely. scsih_qcmd() is called from
__blk_run_queue_uncond() which first increments request_fn_active to a
non-zero value. Thus, scsi_wait_for_queuecommand() never completes because
it is spinning waiting for request_fn_active to become 0.

Two patches interact here. The first:

commit 18f6084a989b ("scsi: mpt3sas: Fix secure erase premature
termination") calls scsi_internal_device_block() for ata passthru commands.

The second patch:

commit 669f044170d8 ("scsi: srp_transport: Move queuecommand() wait code
to SCSI core") adds a call to scsi_wait_for_queuecommand() from
scsi_internal_device_block().

Add a new parameter to scsi_internal_device_block() to decide whether
or not to invoke scsi_wait_for_queuecommand().
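
The interaction described above can be modelled in a few lines of userspace
C. This is a simplified sketch, not the SCSI core code: the struct and
function names are hypothetical, and where the real wait would spin forever
the model returns -EBUSY so the problem case is observable. It models a
caller that is itself running inside request_fn, which is why the counter
cannot drop while it waits.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the request queue state discussed above. */
struct sketch_queue {
	int request_fn_active;	/* > 0 while a request_fn call is in progress */
	bool stopped;		/* set once the queue has been stopped */
};

/*
 * Model of a block helper with a "flush" flag. With flush set, the helper
 * waits for request_fn_active to reach 0. A caller that is inside
 * request_fn holds the counter above 0, so the wait can never finish;
 * the sketch reports that case as -EBUSY instead of spinning.
 */
static int sketch_device_block(struct sketch_queue *q, bool flush)
{
	q->stopped = true;
	if (flush && q->request_fn_active > 0)
		return -EBUSY;	/* the real wait would not return here */
	return 0;
}
```

In this model the queuecommand path corresponds to calling the helper with
flush == false while request_fn_active is 1, and the other callers keep
flush == true.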

Signed-off-by: Jason Baron 
Cc: Sathya Prakash 
Cc: Chaitra P B 
Cc: Suganath Prabu Subramani 
Cc: Sreekanth Reddy 
Cc: Hannes Reinecke 
Cc: Martin K. Petersen 
Cc: Bart Van Assche 
Cc: Sagi Grimberg 
Cc: James Bottomley 
Cc: Christoph Hellwig 
Cc: Doug Ledford 
Cc: David Miller 
---
 drivers/scsi/mpt3sas/mpt3sas_base.h  |  2 +-
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |  6 +++---
 drivers/scsi/scsi_lib.c  | 11 +++
 drivers/scsi/scsi_priv.h |  2 +-
 4 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h 
b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 394fe13..5da3427 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -1431,7 +1431,7 @@ void mpt3sas_transport_update_links(struct 
MPT3SAS_ADAPTER *ioc,
u64 sas_address, u16 handle, u8 phy_number, u8 link_rate);
 extern struct sas_function_template mpt3sas_transport_functions;
 extern struct scsi_transport_template *mpt3sas_transport_template;
-extern int scsi_internal_device_block(struct scsi_device *sdev);
+extern int scsi_internal_device_block(struct scsi_device *sdev, bool flush);
 extern int scsi_internal_device_unblock(struct scsi_device *sdev,
enum scsi_device_state new_state);
 /* trigger data externs */
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index b5c966e..509ef8a 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -2839,7 +2839,7 @@ _scsih_internal_device_block(struct scsi_device *sdev,
sas_device_priv_data->sas_target->handle);
sas_device_priv_data->block = 1;
 
-   r = scsi_internal_device_block(sdev);
+   r = scsi_internal_device_block(sdev, true);
if (r == -EINVAL)
sdev_printk(KERN_WARNING, sdev,
"device_block failed with return(%d) for handle(0x%04x)\n",
@@ -2875,7 +2875,7 @@ _scsih_internal_device_unblock(struct scsi_device *sdev,
"performing a block followed by an unblock\n",
r, sas_device_priv_data->sas_target->handle);
sas_device_priv_data->block = 1;
-   r = scsi_internal_device_block(sdev);
+   r = scsi_internal_device_block(sdev, true);
if (r)
sdev_printk(KERN_WARNING, sdev, "retried device_block "
"failed with return(%d) for handle(0x%04x)\n",
@@ -4068,7 +4068,7 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd 
*scmd)
 * done.
 */
if (ata_12_16_cmd(scmd))
-   scsi_internal_device_block(scmd->device);
+   scsi_internal_device_block(scmd->device, false);
 
sas_device_priv_data = scmd->device->hostdata;
if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index c35b6de..2ee2db9 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2856,9 +2856,11 @@ EXPORT_SYMBOL(scsi_target_resume);
 /**
  * scsi_internal_device_block - internal function to put a device temporarily 
into the SDEV_BLOCK state
  * @sdev:  device to block
+ * @flush: wait for oustanding queuecommand calls to finish
  *
  * Block request made by scsi lld's to temporarily stop all
- * scsi commands on the specified device. May sleep.
+ * scsi commands on the specified device. May sleep if
+ * flush is set
  *
  * Returns zero if successful or error if not
  *
@@ -2873,7 +2875,7 @@ EXPORT_SYMBOL(scsi_target_resume);
  * remove the rport mutex lock and unlock calls from srp_queuecommand().
  */
 int
-scsi_internal_device_block(struct scsi_device *sdev)
+scsi_internal_device_block(struct scsi_device *sdev, bool flush)
 {
struct request_queue *q = sdev->request_queue;
unsigned long flags;
@@ -2898,7 +2900,8 @@ scsi_internal_device_block(struct scsi_device *sdev)
spin_lock_irqsave(q->queue_lock, flags);
blk_stop_queue(q);
spin_unlock_irqrestore(q->queue_lock, flags);
-   

Re: [PATCH] i2c: i801: Register optional lis3lv02d i2c device on Dell machines

2016-12-28 Thread Valdis . Kletnieks
On Wed, 28 Dec 2016 15:03:02 +0100, Wolfram Sang said:
> > I have absolutely no idea how to you want to achieve calling that
> > i2c_new_device() registration
> > without kernel patches.
>
> Documentation/i2c/instantiating-devices lists all supported methods.
> Method 4 is userspace instantiation.

I'd be totally OK with userspace doing it, except for the question "How good
will distros be about shipping it"?  I don't have any sense of how good Fedora
and Ubuntu and so on will be about making sure the userspace part is already
done for the user.

Anybody got evidence one way or another?
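
For reference, userspace instantiation amounts to writing a device name and
an address to the adapter's new_device file in sysfs. The sketch below shows
one way a small helper could do that. The function names are hypothetical,
and the bus number and the 0x29 address in the usage note are placeholders
chosen for illustration, not values confirmed in this thread.

```c
#include <stdio.h>

/* Format "<name> 0x<addr>\n", the line written to new_device. */
static int sketch_format_new_device(char *buf, size_t len,
				    const char *name, unsigned int addr)
{
	return snprintf(buf, len, "%s 0x%02x\n", name, addr);
}

/*
 * Write the formatted line to the given path, typically something like
 * /sys/bus/i2c/devices/i2c-N/new_device. Returns 0 on success, -1 on error.
 */
static int sketch_instantiate(const char *new_device_path,
			      const char *name, unsigned int addr)
{
	char line[64];
	FILE *f;
	int n = sketch_format_new_device(line, sizeof(line), name, addr);

	if (n < 0 || (size_t)n >= sizeof(line))
		return -1;

	f = fopen(new_device_path, "w");
	if (!f)
		return -1;
	if (fputs(line, f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}
```

A distro-shipped helper would call something like
sketch_instantiate("/sys/bus/i2c/devices/i2c-N/new_device", "lis3lv02d", 0x29)
with the right bus and address for the machine, which is exactly the part
that has to be maintained somewhere outside the kernel.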





Re: [PATCH 2/2] mm: add PageWaiters indicating tasks are waiting for a page bit

2016-12-28 Thread Linus Torvalds
On Wed, Dec 28, 2016 at 8:08 PM, Nicholas Piggin  wrote:
>
> Okay. The name could be a bit better though I think, for readability.
> Just a BUILD_BUG_ON if it is not constant and correct bit numbers?

I have a slightly edited patch - moved the comments around and added
some new comments (about both the sign bit, but also about how the
smp_mb() shouldn't be necessary even for the non-atomic fallback).

I also did a BUILD_BUG_ON(), except the other way around - keeping it
about the sign bit in the byte, just verifying that yes,
PG_waiters is that sign bit.

> BTW. I just noticed in your patch too that you didn't use "nr" in the
> generic version.

And I fixed that too.

Of course, I didn't test the changes (apart from building it). But
I've been running the previous version since yesterday, so far no
issues.

Linus
 arch/x86/include/asm/bitops.h | 13 +
 include/linux/page-flags.h|  2 +-
 mm/filemap.c  | 36 +++-
 3 files changed, 45 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index 68557f52b961..854022772c5b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -139,6 +139,19 @@ static __always_inline void __clear_bit(long nr, volatile unsigned long *addr)
asm volatile("btr %1,%0" : ADDR : "Ir" (nr));
 }
 
+static __always_inline bool clear_bit_unlock_is_negative_byte(long nr, volatile unsigned long *addr)
+{
+   bool negative;
+   asm volatile(LOCK_PREFIX "andb %2,%1\n\t"
+   CC_SET(s)
+   : CC_OUT(s) (negative), ADDR
+   : "ir" ((char) ~(1 << nr)) : "memory");
+   return negative;
+}
+
+// Let everybody know we have it
+#define clear_bit_unlock_is_negative_byte clear_bit_unlock_is_negative_byte
+
 /*
  * __clear_bit_unlock - Clears a bit in memory
  * @nr: Bit to clear
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index c56b39890a41..6b5818d6de32 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -73,13 +73,13 @@
  */
 enum pageflags {
PG_locked,  /* Page is locked. Don't touch. */
-   PG_waiters, /* Page has waiters, check its waitqueue */
PG_error,
PG_referenced,
PG_uptodate,
PG_dirty,
PG_lru,
PG_active,
+   PG_waiters, /* Page has waiters, check its waitqueue. Must be bit #7 and in the same byte as "PG_locked" */
PG_slab,
PG_owner_priv_1,/* Owner use. If pagecache, fs may use*/
PG_arch_1,
diff --git a/mm/filemap.c b/mm/filemap.c
index 82f26cde830c..6b1d96f86a9c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -912,6 +912,29 @@ void add_page_wait_queue(struct page *page, wait_queue_t *waiter)
 }
 EXPORT_SYMBOL_GPL(add_page_wait_queue);
 
+#ifndef clear_bit_unlock_is_negative_byte
+
+/*
+ * PG_waiters is the high bit in the same byte as PG_locked.
+ *
+ * On x86 (and on many other architectures), we can clear PG_locked and
+ * test the sign bit at the same time. But if the architecture does
+ * not support that special operation, we just do this all by hand
+ * instead.
+ *
+ * The read of PG_waiters has to be after (or concurrently with) PG_locked
+ * being cleared, but a memory barrier should be unnecessary since it is
+ * in the same byte as PG_locked.
+ */
+static inline bool clear_bit_unlock_is_negative_byte(long nr, volatile void *mem)
+{
+   clear_bit_unlock(nr, mem);
+   /* smp_mb__after_atomic(); */
+   return test_bit(PG_waiters, mem);
+}
+
+#endif
+
 /**
  * unlock_page - unlock a locked page
  * @page: the page
@@ -921,16 +944,19 @@ EXPORT_SYMBOL_GPL(add_page_wait_queue);
  * mechanism between PageLocked pages and PageWriteback pages is shared.
  * But that's OK - sleepers in wait_on_page_writeback() just go back to sleep.
  *
- * The mb is necessary to enforce ordering between the clear_bit and the read
- * of the waitqueue (to avoid SMP races with a parallel wait_on_page_locked()).
+ * Note that this depends on PG_waiters being the sign bit in the byte
+ * that contains PG_locked - thus the BUILD_BUG_ON(). That allows us to
+ * clear the PG_locked bit and test PG_waiters at the same time fairly
+ * portably (architectures that do LL/SC can test any bit, while x86 can
+ * test the sign bit).
  */
 void unlock_page(struct page *page)
 {
+	BUILD_BUG_ON(PG_waiters != 7);
 	page = compound_head(page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
-	clear_bit_unlock(PG_locked, &page->flags);
-	smp_mb__after_atomic();
-	wake_up_page(page, PG_locked);
+	if (clear_bit_unlock_is_negative_byte(PG_locked, &page->flags))
+		wake_up_page_bit(page, PG_locked);
 }
 EXPORT_SYMBOL(unlock_page);
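
For readers who want to try the idea outside the kernel, here is a minimal userspace sketch of "clear the lock bit and test the sign bit of the same byte in one atomic operation". It is not the kernel code above: it uses C11 atomics instead of the x86 "andb" asm, and the names demo_clear_bit_unlock_is_negative_byte, DEMO_LOCKED and DEMO_WAITERS are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical bit layout, mirroring the patch: bit 0 is the lock bit,
 * bit 7 is the "waiters" bit, i.e. the sign bit of the same byte.
 */
enum { DEMO_LOCKED = 0, DEMO_WAITERS = 7 };

/*
 * Clear bit nr with release ordering and report whether bit 7 is set
 * in the byte that results. A single atomic AND both clears the lock
 * bit and yields the byte contents, so no separate barrier-plus-read
 * is needed to look at the waiters bit.
 */
static bool demo_clear_bit_unlock_is_negative_byte(int nr,
						   _Atomic unsigned char *byte)
{
	unsigned char mask = (unsigned char)~(1u << nr);
	unsigned char old;

	assert(nr >= 0 && nr < DEMO_WAITERS);
	old = atomic_fetch_and_explicit(byte, mask, memory_order_release);

	/* (old & mask) is the value the AND just stored. */
	return ((old & mask) & (1u << DEMO_WAITERS)) != 0;
}
```

A caller would follow the same pattern as unlock_page() in the patch: only when the function returns true does it go on to the (comparatively expensive) waitqueue wakeup.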
 


4.9-rt: uprev notes, incremental uprev, adding bisection ability

2016-12-28 Thread Paul Gortmaker
TL;DR: incremental -rt patch uprev from 4.8-rt to the new 4.9-rt
covering 175 touch down points on mainline merges & tags by Linus;
useful for rt developers and bugfixers to research and bisect with.

If you just want the bisect points generated on your machine, then:

mkdir rt-test
cd rt-test
# clone mainline or clone from a local copy with >=4.9 tags in it.
git clone /path/to/my/existing/mainline/linux linux-rt-bisect
git clone git://git.kernel.org/pub/scm/linux/kernel/git/paulg/4.9-rt-patches.git

cd linux-rt-bisect
../4.9-rt-patches/scripts/create-branches.sh

  -- end TL;DR -- 8<  end TL;DR --

Now that 4.9-rt1 is out there, we can do what was done[1] for 4.8-rt
and create an incremental uprev to get an idea of what mainline
merges impacted the patches in the preempt-rt series to bring more
clarity into what conflicts arose:

https://git.kernel.org/cgit/linux/kernel/git/paulg/4.9-rt-patches.git

The usual suspects have an impact on rt:  scheduler changes in tip, x86
changes in tip, CPU hotplug (this time) in tip, large "patch bombs"
from akpm, and even, to some extent, net-next content from davem.


Points of interest:
---

lglocks:  -removed from mainline, so we get to drop the -rt
extensions to that support.  However -rt specific users need
to be converted to something else; e.g. see the new /* XXX */
in cpu_stopper_thread, where an lglock used to be used.

thread_info: -x86 dabbled with THREAD_INFO_IN_TASK then disabled it
again before rc2.  With the benefit of hindsight, we can
take the disable commit and temporarily backport it so that
we essentially never have it in task, which means saving a
whole lot of mess in the preempt_lazy patches for nothing.
(see Add / Delete patch marked with "*" at end of this post)

new/removed patches:  -there were six new patches added and 16
patches removed.  A couple of the "removed" patches were not
really removed, but instead squished as fixups into another
related patch.

Further details can be found at the end of this post, where the merges
and the patches they impacted are listed.  A majority of the changes
are just basic context refresh -- updating the patch to match the
changes in the surrounding code without changing what the patch
really is doing.

One can also inspect the commit logs in the above git repo since the
baseline v4.8.15-rt10 commit.

Sanity boot testing was done on RT_FULL for x86-64 along the way on
selected "high risk" merges, like those listed above.  The final
result is that the patches at the end of the incremental uprev here
match the patches of v4.9-rt1 just recently released.


Bisecting to find new issues in 4.9-rt:
---

The other benefit of this is that it allows one to quasi-bisect the
preempt-rt series (treated as a whole) across the roughly 15,000 patches
that make up the new 4.9 content.  This allowed me to solve several bugs
in 4.8-rt that I would have had an extremely hard time solving in
any other way.  I won't repeat every detail of why this process
makes sense, since I've documented that in the past[2].

The 15,000 odd commits of 4.9 made it to mainline via ~165 merges
from Linus between 4.8 and v4.9-rc1 (the merge window) where we can
apply the -rt patches and see which feature merges cause disruption
to the -rt patches.

Not every feature/maintainer-merge causes issues with the -rt patch
queue: I had to create just over 50 commits to the v4.8.15-rt10
baseline in order to have about 30 possible series of patches to
cover all those 165 merges from 4.8 to mainline 4.9-rc1.

Since Linux development has far fewer commits in rc2, rc3, ...
rc8, and final 4.9, those tags represent the remaining bisection
points that can be created and used from this repo.  I didn't do
per-merge patch application testing post-rc1 (just as was done for
the 4.8-rt incremental uprev earlier) since it would be pointless.

 ---

Quick recap on how to use this repo to do merge level bisect:

1) Full list of feature merges in mainline repo you can test on:
git log --oneline --merges --author=Torvalds  \
--reverse ^v4.8 v4.9-rc1

2) Tags in this 4.9-rt patch repo look like:
rt-v4.8-101-g7af8a0f80888
rt-v4.8-304-g4b978934a440
rt-v4.8-373-g00bcf5cdd6c0
rt-v4.8-558-ge606d81d2d95
rt-v4.8-627-gaf79ad2b1f33
 [...]
rt-v4.8-12632-ga379f71a30dd
rt-v4.8-14088-g6b25e21fa6f2
rt-v4.8-15054-g9ffc66941df2
rt-v4.9-rc1
rt-v4.9-rc2
 [...]
rt-v4.9-rc8
rt-v4.9

3) Say you want to test -rt on this merge from #1:
"af79ad2b1f33 Merge branch 'sched-core-for-linus' ..."
   You run "git describe af79ad2b1f33" and get:
v4.8-627-gaf79ad2b1f33
   No problem, you check out the patches tagged in #2 with the
   matching name.  These will apply to that merge as a 

Re: [PATCH 2/2] mm: add PageWaiters indicating tasks are waiting for a page bit

2016-12-28 Thread Nicholas Piggin
On Wed, 28 Dec 2016 11:17:00 -0800
Linus Torvalds  wrote:

> On Tue, Dec 27, 2016 at 7:53 PM, Nicholas Piggin  wrote:
> >>
> >> Yeah, that patch is disgusting, and doesn't even help x86.  
> >
> > No, although it would help some cases (but granted the bitops tend to
> > be problematic in this regard). To be clear I'm not asking to merge it,
> > just wondered your opinion. (We need something more for unlock_page
> > anyway because the memory barrier in the way).  
> 
> The thing is, the patch seems pointless anyway. The "add_return()"
> kind of cases already return the value, so any code that cares can
> just use that. And the other cases are downright incorrect, like the
> removal of "volatile" from the bit test ops.

Yeah that's true, but you can't carry that over multiple
primitives. For bitops it's often the case you get several bitops
on the same word close together.

> 
> >> It also
> >> depends on the compiler doing the right thing in ways that are not
> >> obviously true.  
> >
> > Can you elaborate on this? GCC will do the optimization (modulo a
> > regression https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77647)  
> 
> So the removal of volatile is just one example of that. You're
> basically forcing magical side effects. I've never seen this trick
> _documented_, and quite frankly, the gcc people have had a history of
> changing their own documentation when it came to their own extensions
> (ie they've changed how inline functions work etc).
> 
> But I also worry about interactions with different gcc versions, or
> with the LLVM people who try to build the kernel with non-gcc
> compilers.
> 
> Finally, it fundamentally can't work on x86 anyway, except for the
> add-return type of operations, which by definitions are pointless (see
> above).
> 
> So everything just screams "this is a horrible approach" to me.

You're probably right. The few cases where it matters may just be served
with special primitives.

> 
> > Patch seems okay, but it's kind of a horrible primitive. What if you
> > did clear_bit_unlock_and_test_bit, which does a __builtin_constant_p
> > test on the bit numbers and if they are < 7 and == 7, then do the
> > fastpath?  
> 
> So the problem with that is that it makes no sense *except* in the
> case where the bit is 7. So why add a "generic" function for something
> that really isn't generic?

Yeah you're also right, I kind of realized after hitting send.

> 
> I agree that it's a hacky interface, but I also happen to believe that
> being explicit about what you are actually doing causes less pain.
> It's not magical, and it's not subtle. There's no question about what
> it does behind your back, and people won't use it by mistake in the
> wrong context where it doesn't actually work any better than just
> doing the obvious thing.

Okay. The name could be a bit better though I think, for readability.
Just a BUILD_BUG_ON if it is not constant and correct bit numbers?

BTW. I just noticed in your patch too that you didn't use "nr" in the
generic version.

Thanks,
Nick
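
The BUILD_BUG_ON suggestion above can be sketched in plain C11, assuming nothing more than a compile-time assertion on the bit numbers. The enum below is a hypothetical stand-in that only mirrors the flag ordering discussed in this thread; _Static_assert plays the role of the kernel's BUILD_BUG_ON(), and demo_same_byte() is an illustrative helper, not a kernel function.

```c
#include <assert.h>

/* Hypothetical flag numbering: waiters is moved so that it lands on bit 7. */
enum demo_pageflags {
	DEMO_PG_locked,
	DEMO_PG_error,
	DEMO_PG_referenced,
	DEMO_PG_uptodate,
	DEMO_PG_dirty,
	DEMO_PG_lru,
	DEMO_PG_active,
	DEMO_PG_waiters,
};

/* The build fails if somebody reorders the enum and breaks the layout. */
_Static_assert(DEMO_PG_waiters == 7,
	       "waiters must be the sign bit of the low byte");
_Static_assert(DEMO_PG_locked < 7,
	       "locked must be a lower bit of that same byte");

/* Runtime helper: do two bit numbers fall into the same byte? */
static inline int demo_same_byte(int a, int b)
{
	assert(a >= 0 && b >= 0);
	return a / 8 == b / 8;
}
```

Because the check is evaluated at compile time it costs nothing at runtime, which is why it fits an interface that only makes sense for one specific bit.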


Re: [PATCH] cpufreq: schedutil: add up/down frequency transition rate limits

2016-12-28 Thread Wanpeng Li
2016-11-21 20:26 GMT+08:00 Peter Zijlstra :
> On Mon, Nov 21, 2016 at 12:14:32PM +, Juri Lelli wrote:
>> On 21/11/16 11:19, Peter Zijlstra wrote:
>
>> > So no tunables and rate limits here at all please.
>> >
>> > During LPC we discussed the rampup and decay issues and decided that we
>> > should very much first address them by playing with the PELT stuff.
>> > Morton was going to play with capping the decay on the util signal. This
>> > should greatly improve the ramp-up scenario and cure some other wobbles.
>> >
>> > The decay can be set by changing the over-all pelt decay, if so desired.
>> >
>>
>> Do you mean we might want to change the decay (make it different from
>> ramp-up) once for all, or maybe we make it tunable so that we can
>> address different power/perf requirements?
>
> So the limited decay would be the dominant factor in ramp-up time,
> leaving the regular PELT period the dominant factor for ramp-down.
>
> (Note that the decay limit would only be applied on the per-task signal,
> not the accumulated signal.)

What's the meaning of "signal" in this thread?

Regards,
Wanpeng Li


Re: [PATCH] ceph: cleanup ACCESS_ONCE -> READ_ONCE

2016-12-28 Thread Yan, Zheng

> On 26 Dec 2016, at 17:26, Seraphime Kirkovski  wrote:
> 
> This removes the uses of ACCESS_ONCE in favor of READ_ONCE
> 
> Signed-off-by: Seraphime Kirkovski 
> ---
> fs/ceph/addr.c   |  4 ++--
> fs/ceph/caps.c   |  2 +-
> fs/ceph/dir.c|  2 +-
> fs/ceph/inode.c  |  2 +-
> fs/ceph/mds_client.c | 10 +-
> 5 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> index 9cd0c0e..e61bc3f 100644
> --- a/fs/ceph/addr.c
> +++ b/fs/ceph/addr.c
> @@ -771,7 +771,7 @@ static int ceph_writepages_start(struct address_space *mapping,
>wbc->sync_mode == WB_SYNC_NONE ? "NONE" :
>(wbc->sync_mode == WB_SYNC_ALL ? "ALL" : "HOLD"));
> 
> - if (ACCESS_ONCE(fsc->mount_state) == CEPH_MOUNT_SHUTDOWN) {
> + if (READ_ONCE(fsc->mount_state) == CEPH_MOUNT_SHUTDOWN) {
>   if (ci->i_wrbuffer_ref > 0) {
>   pr_warn_ratelimited(
>   "writepage_start %p %lld forced umount\n",
> @@ -1194,7 +1194,7 @@ static int ceph_update_writeable_page(struct file *file,
>   int r;
>   struct ceph_snap_context *snapc, *oldest;
> 
> - if (ACCESS_ONCE(fsc->mount_state) == CEPH_MOUNT_SHUTDOWN) {
> + if (READ_ONCE(fsc->mount_state) == CEPH_MOUNT_SHUTDOWN) {
>   dout(" page %p forced umount\n", page);
>   unlock_page(page);
>   return -EIO;
> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> index baea866..193ba82 100644
> --- a/fs/ceph/caps.c
> +++ b/fs/ceph/caps.c
> @@ -2477,7 +2477,7 @@ static int try_get_cap_refs(struct ceph_inode_info *ci, int need, int want,
> 
>   if (ci->i_ceph_flags & CEPH_I_CAP_DROPPED) {
>   int mds_wanted;
> - if (ACCESS_ONCE(mdsc->fsc->mount_state) ==
> + if (READ_ONCE(mdsc->fsc->mount_state) ==
>   CEPH_MOUNT_SHUTDOWN) {
>   dout("get_cap_refs %p forced umount\n", inode);
>   *err = -EIO;
> diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> index d7a9369..cd99b26 100644
> --- a/fs/ceph/dir.c
> +++ b/fs/ceph/dir.c
> @@ -1194,7 +1194,7 @@ static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags)
>   struct inode *dir;
> 
>   if (flags & LOOKUP_RCU) {
> - parent = ACCESS_ONCE(dentry->d_parent);
> + parent = READ_ONCE(dentry->d_parent);
>   dir = d_inode_rcu(parent);
>   if (!dir)
>   return -ECHILD;
> diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> index 398e532..1476f54 100644
> --- a/fs/ceph/inode.c
> +++ b/fs/ceph/inode.c
> @@ -1719,7 +1719,7 @@ static void ceph_invalidate_work(struct work_struct *work)
> 
>   mutex_lock(&ci->i_truncate_mutex);
> 
> - if (ACCESS_ONCE(fsc->mount_state) == CEPH_MOUNT_SHUTDOWN) {
> + if (READ_ONCE(fsc->mount_state) == CEPH_MOUNT_SHUTDOWN) {
>   pr_warn_ratelimited("invalidate_pages %p %lld forced umount\n",
>   inode, ceph_ino(inode));
>   mapping_set_error(inode->i_mapping, -EIO);
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 4f49253..df90794 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -1145,7 +1145,7 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap,
>   ci->i_ceph_flags |= CEPH_I_CAP_DROPPED;
> 
>   if (ci->i_wrbuffer_ref > 0 &&
> - ACCESS_ONCE(fsc->mount_state) == CEPH_MOUNT_SHUTDOWN)
> + READ_ONCE(fsc->mount_state) == CEPH_MOUNT_SHUTDOWN)
>   invalidate = true;
> 
>   while (!list_empty(&ci->i_cap_flush_list)) {
> @@ -2095,12 +2095,12 @@ static int __do_request(struct ceph_mds_client *mdsc,
>   err = -EIO;
>   goto finish;
>   }
> - if (ACCESS_ONCE(mdsc->fsc->mount_state) == CEPH_MOUNT_SHUTDOWN) {
> + if (READ_ONCE(mdsc->fsc->mount_state) == CEPH_MOUNT_SHUTDOWN) {
>   dout("do_request forced umount\n");
>   err = -EIO;
>   goto finish;
>   }
> - if (ACCESS_ONCE(mdsc->fsc->mount_state) == CEPH_MOUNT_MOUNTING) {
> + if (READ_ONCE(mdsc->fsc->mount_state) == CEPH_MOUNT_MOUNTING) {
>   if (mdsc->mdsmap_err) {
>   err = mdsc->mdsmap_err;
>   dout("do_request mdsmap err %d\n", err);
> @@ -3550,7 +3550,7 @@ void ceph_mdsc_sync(struct ceph_mds_client *mdsc)
> {
>   u64 want_tid, want_flush;
> 
> - if (ACCESS_ONCE(mdsc->fsc->mount_state) == CEPH_MOUNT_SHUTDOWN)
> + if (READ_ONCE(mdsc->fsc->mount_state) == CEPH_MOUNT_SHUTDOWN)
>   return;
> 
>   dout("sync\n");
> @@ -3581,7 +3581,7 @@ void ceph_mdsc_sync(struct ceph_mds_client *mdsc)
>  */
> static bool done_closing_sessions(struct ceph_mds_client *mdsc, int skipped)
> {
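
Note on the substitution in the diff above: for readers outside the kernel tree, the pattern can be sketched in userspace. This is only a minimal illustration under stated assumptions, not the kernel's definition. READ_ONCE_SKETCH, struct sketch_fs_client and the SKETCH_* constants are hypothetical stand-ins, and the macro relies on the GCC/Clang __typeof__ extension and a scalar operand. The volatile cast asks the compiler to emit a single load of the field instead of caching or re-reading it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the kernel macro: one volatile load
 * of a scalar lvalue. Relies on the GCC/Clang __typeof__ extension.
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Illustrative state values; the real CEPH_MOUNT_* constants are not shown. */
enum sketch_mount_state {
	SKETCH_MOUNT_MOUNTING,
	SKETCH_MOUNT_MOUNTED,
	SKETCH_MOUNT_SHUTDOWN,
};

struct sketch_fs_client {
	int mount_state;
};

#define SKETCH_EIO 5

/* Same shape as the checks in the diff: fail with -EIO after a forced umount. */
static int sketch_check_mount(struct sketch_fs_client *fsc)
{
	assert(fsc != NULL);
	if (READ_ONCE_SKETCH(fsc->mount_state) == SKETCH_MOUNT_SHUTDOWN)
		return -SKETCH_EIO;
	return 0;
}
```

A caller would pass a pointer to the client structure and treat a negative return as "forced umount in progress", as the call sites in the patch do with -EIO.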

Re: [1/5] ARM: dts: qcom: apq8064: Add missing scm clock

2016-12-28 Thread Andy Gross
On Wed, Dec 21, 2016 at 03:49:35AM -0800, Bjorn Andersson wrote:
> As per the device tree binding the apq8064 scm node requires the core
> clock to be specified, so add this.
> 
> Cc: John Stultz 
> Signed-off-by: Bjorn Andersson 
> ---
>  arch/arm/boot/dts/qcom-apq8064.dtsi | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/qcom-apq8064.dtsi 
> b/arch/arm/boot/dts/qcom-apq8064.dtsi
> index 268bd470c865..78bf155a52f3 100644
> --- a/arch/arm/boot/dts/qcom-apq8064.dtsi
> +++ b/arch/arm/boot/dts/qcom-apq8064.dtsi
> @@ -303,6 +303,9 @@
>   firmware {
>   scm {
>   compatible = "qcom,scm-apq8064";
> +
> + clocks = <&gcc CE3_CORE_CLK>;
> + clock-names = "core";

Isn't this supposed to be the DFAB clk?  The RPM one?  I think that's why we let
the clock just fall through optionally before the recent changes that broke
this.

Regards,

Andy


Re: [PATCH] Revert "mmc: dw_mmc-rockchip: add runtime PM support"

2016-12-28 Thread Jaehoon Chung
Hi Randy,

On 12/29/2016 12:34 AM, Randy Li wrote:
> This reverts commit f90142683f04bcb0729bf0df67a5e29562b725b9.
> It is reported that making RK3288 can't boot from eMMC/MMC.

Could you explain in more detail?
As you mentioned, this patch prevents the RK3288 from booting, but why?
A better approach would be to find the root cause and fix it,
not just revert.

Best Regards,
Jaehoon Chung

> 
> Signed-off-by: Randy Li 
> ---
>  drivers/mmc/host/dw_mmc-rockchip.c | 41 
> +++---
>  1 file changed, 3 insertions(+), 38 deletions(-)
> 
> diff --git a/drivers/mmc/host/dw_mmc-rockchip.c 
> b/drivers/mmc/host/dw_mmc-rockchip.c
> index 9a46e46..3189234 100644
> --- a/drivers/mmc/host/dw_mmc-rockchip.c
> +++ b/drivers/mmc/host/dw_mmc-rockchip.c
> @@ -14,7 +14,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  
>  #include "dw_mmc.h"
> @@ -327,7 +326,6 @@ static int dw_mci_rockchip_probe(struct platform_device 
> *pdev)
>  {
>   const struct dw_mci_drv_data *drv_data;
>   const struct of_device_id *match;
> - int ret;
>  
>   if (!pdev->dev.of_node)
>   return -ENODEV;
> @@ -335,49 +333,16 @@ static int dw_mci_rockchip_probe(struct platform_device 
> *pdev)
>   match = of_match_node(dw_mci_rockchip_match, pdev->dev.of_node);
>   drv_data = match->data;
>  
> - pm_runtime_get_noresume(&pdev->dev);
> - pm_runtime_set_active(&pdev->dev);
> - pm_runtime_enable(&pdev->dev);
> - pm_runtime_set_autosuspend_delay(&pdev->dev, 50);
> - pm_runtime_use_autosuspend(&pdev->dev);
> -
> - ret = dw_mci_pltfm_register(pdev, drv_data);
> - if (ret) {
> - pm_runtime_disable(&pdev->dev);
> - pm_runtime_set_suspended(&pdev->dev);
> - pm_runtime_put_noidle(&pdev->dev);
> - return ret;
> - }
> -
> - pm_runtime_put_autosuspend(&pdev->dev);
> -
> - return 0;
> + return dw_mci_pltfm_register(pdev, drv_data);
>  }
>  
> -static int dw_mci_rockchip_remove(struct platform_device *pdev)
> -{
> - pm_runtime_get_sync(&pdev->dev);
> - pm_runtime_disable(&pdev->dev);
> - pm_runtime_put_noidle(&pdev->dev);
> -
> - return dw_mci_pltfm_remove(pdev);
> -}
> -
> -static const struct dev_pm_ops dw_mci_rockchip_dev_pm_ops = {
> - SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
> - pm_runtime_force_resume)
> - SET_RUNTIME_PM_OPS(dw_mci_runtime_suspend,
> -dw_mci_runtime_resume,
> -NULL)
> -};
> -
>  static struct platform_driver dw_mci_rockchip_pltfm_driver = {
>   .probe  = dw_mci_rockchip_probe,
> - .remove = dw_mci_rockchip_remove,
> + .remove = dw_mci_pltfm_remove,
>   .driver = {
>   .name   = "dwmmc_rockchip",
>   .of_match_table = dw_mci_rockchip_match,
> - .pm = &dw_mci_rockchip_dev_pm_ops,
> + .pm = &dw_mci_pltfm_pmops,
>   },
>  };
>  
> 



Re: [PATCH 3/6] ubifs: Use 64bit readdir cookies

2016-12-28 Thread J. Bruce Fields
On Thu, Dec 01, 2016 at 11:02:18PM +0100, Richard Weinberger wrote:
> This is the first step to support proper telldir/seekdir()
> in UBIFS.
> Let's report 64bit cookies in readdir(). The cookie is a combination
> of the entry key plus the double hash value.

Would it be possible to explain what that means in a little detail, for
a ubifs-ignoramus?

I'm just curious how it meets the requirements for nfs exports.

--b.

> 
> Signed-off-by: Richard Weinberger 
> ---
>  fs/ubifs/dir.c   | 46 +++
>  fs/ubifs/key.h   | 59 
> 
>  fs/ubifs/ubifs.h |  1 +
>  3 files changed, 94 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
> index 883b2fdf51df..3b8c08dad75b 100644
> --- a/fs/ubifs/dir.c
> +++ b/fs/ubifs/dir.c
> @@ -539,7 +539,7 @@ static int ubifs_readdir(struct file *file, struct 
> dir_context *ctx)
>  
>   dbg_gen("dir ino %lu, f_pos %#llx", dir->i_ino, ctx->pos);
>  
> - if (ctx->pos > UBIFS_S_KEY_HASH_MASK || ctx->pos == 2)
> + if (ctx->pos == 2)
>   /*
>* The directory was seek'ed to a senseless position or there
>* are no more entries.
> @@ -594,7 +594,7 @@ static int ubifs_readdir(struct file *file, struct 
> dir_context *ctx)
>   goto out;
>   }
>  
> - ctx->pos = key_hash_flash(c, &dent->key);
> + ctx->pos = key_get_dir_pos(c, file, dent);
>   file->private_data = dent;
>   }
>  
> @@ -604,21 +604,43 @@ static int ubifs_readdir(struct file *file, struct 
> dir_context *ctx)
>* The directory was seek'ed to and is now readdir'ed.
>* Find the entry corresponding to @ctx->pos or the closest one.
>*/
> - dent_key_init_hash(c, &key, dir->i_ino, ctx->pos);
> - fname_len(&nm) = 0;
> - dent = ubifs_tnc_next_ent(c, &key, &nm);
> - if (IS_ERR(dent)) {
> - err = PTR_ERR(dent);
> - goto out;
> + dent_key_init_hash(c, &key, dir->i_ino,
> +key_get_hash_from_dir_pos(c, file, 
> ctx->pos));
> +
> + if (key_want_short_hash(file)) {
> + err = -ENOENT;
> + } else {
> + dent = kmalloc(UBIFS_MAX_DENT_NODE_SZ, GFP_NOFS);
> + if (!dent) {
> + err = -ENOMEM;
> + goto out;
> + }
> +
> + err = ubifs_tnc_lookup_dh(c, &key, dent,
> + key_get_cookie_from_dir_pos(c, ctx->pos));
> + }
> + if (err) {
> + kfree(dent);
> +
> + if (err < 0 && err != -ENOENT && err != -EOPNOTSUPP)
> + goto out;
> +
> + fname_len(&nm) = 0;
> + dent = ubifs_tnc_next_ent(c, &key, &nm);
> + if (IS_ERR(dent)) {
> + err = PTR_ERR(dent);
> + goto out;
> + }
>   }
> - ctx->pos = key_hash_flash(c, &dent->key);
> +
> + ctx->pos = key_get_dir_pos(c, file, dent);
>   file->private_data = dent;
>   }
>  
>   while (1) {
> - dbg_gen("feed '%s', ino %llu, new f_pos %#x",
> + dbg_gen("feed '%s', ino %llu, new f_pos %#lx",
>   dent->name, (unsigned long long)le64_to_cpu(dent->inum),
> - key_hash_flash(c, &dent->key));
> + (unsigned long)key_get_dir_pos(c, file, dent));
>   ubifs_assert(le64_to_cpu(dent->ch.sqnum) >
>ubifs_inode(dir)->creat_sqnum);
>  
> @@ -656,7 +678,7 @@ static int ubifs_readdir(struct file *file, struct 
> dir_context *ctx)
>   }
>  
>   kfree(file->private_data);
> - ctx->pos = key_hash_flash(c, &dent->key);
> + ctx->pos = key_get_dir_pos(c, file, dent);
>   file->private_data = dent;
>   cond_resched();
>   }
> diff --git a/fs/ubifs/key.h b/fs/ubifs/key.h
> index 7547be512db2..2788e36ce832 100644
> --- a/fs/ubifs/key.h
> +++ b/fs/ubifs/key.h
> @@ -397,6 +397,65 @@ static inline uint32_t key_hash_flash(const struct 
> ubifs_info *c, const void *k)
>  }
>  
>  /**
> + * key_want_short_hash - tests whether we can emit a 64bit hash or not.
> + * @file: the file handle of the directory
> + */
> +static inline bool key_want_short_hash(struct file *file)
> +{
> + if (file->f_mode & FMODE_32BITHASH)
> + return true;
> +
> + if (!(file->f_mode & FMODE_64BITHASH) && is_32bit_api())
> + return true;
> +
> + return false;
> +}
> +
> +/**
> + * key_dir_pos - compute a 64bit directory cookie for readdir()
> + * @c: UBIFS file-system description object
> + * 
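
Note on the question above about what "a combination of the entry key plus the double hash value" means: the helper definitions in the quoted patch are cut off, so the following is only a rough sketch under assumptions. It supposes that the 32-bit name hash from the key sits in the low half of the 64-bit position and a 32-bit per-entry cookie (the "double hash") sits in the high half. The layout and the dir_pos_* names are hypothetical and are not taken from the patch. A caller limited to 32-bit positions (the FMODE_32BITHASH case in key_want_short_hash()) would see the hash alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout, assumed for illustration only:
 *   bits 63..32  per-entry cookie ("double hash")
 *   bits 31..0   name hash taken from the directory entry key
 */
static inline uint64_t dir_pos_pack(uint32_t hash, uint32_t cookie,
				    bool short_hash)
{
	if (short_hash)
		return hash;	/* 32-bit callers get the hash alone */
	return ((uint64_t)cookie << 32) | hash;
}

/* Recover the name hash used to position the lookup in the index. */
static inline uint32_t dir_pos_hash(uint64_t pos)
{
	return (uint32_t)(pos & 0xffffffffu);
}

/* Recover the cookie that tells colliding entries apart. */
static inline uint32_t dir_pos_cookie(uint64_t pos)
{
	return (uint32_t)(pos >> 32);
}

/* Round-trip check that doubles as a usage example. */
static inline void dir_pos_selfcheck(void)
{
	uint64_t pos = dir_pos_pack(0x1234u, 0xabcdu, false);

	assert(dir_pos_hash(pos) == 0x1234u);
	assert(dir_pos_cookie(pos) == 0xabcdu);
}
```

The point of carrying the cookie is that two names with the same hash still get distinct positions, which is what a seekdir()/telldir() consumer such as an NFS server needs. With the short form, colliding names share one position and the lookup can only resume at the nearest entry.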

Re: [PATCH 6/6] ubifs: Wire up NFS support

2016-12-28 Thread J. Bruce Fields
On Thu, Dec 01, 2016 at 11:02:21PM +0100, Richard Weinberger wrote:
> Since we have 64bit readdir cookies and export operations
> we can finally enable NFS export support for UBIFS.
> 
> Signed-off-by: Richard Weinberger 
> ---
>  fs/ubifs/dir.c   | 9 ++---
>  fs/ubifs/super.c | 3 +++
>  2 files changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
> index 5485d836af21..13d13afd2976 100644
> --- a/fs/ubifs/dir.c
> +++ b/fs/ubifs/dir.c
> @@ -289,11 +289,8 @@ static struct dentry *ubifs_lookup(struct inode *dir, 
> struct dentry *dentry,
>  done:
>   kfree(dent);
>   fscrypt_free_filename(&nm);
> - /*
> -  * Note, d_splice_alias() would be required instead if we supported
> -  * NFS.
> -  */
> - d_add(dentry, inode);
> +
> + d_splice_alias(inode, dentry);
>   return NULL;

I'm pretty sure that should be 

return d_splice_alias(inode, dentry);

--b.

>  
>  out_dent:
> @@ -524,8 +521,6 @@ static unsigned int vfs_dent_type(uint8_t type)
>   * properly by means of saving full directory entry name in the private field
>   * of the file description object.
>   *
> - * This means that UBIFS cannot support NFS which requires full
> - * 'seekdir()'/'telldir()' support.
>   */
>  static int ubifs_readdir(struct file *file, struct dir_context *ctx)
>  {
> diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
> index be5b697d8214..4cb7f641f35c 100644
> --- a/fs/ubifs/super.c
> +++ b/fs/ubifs/super.c
> @@ -2092,6 +2092,9 @@ static int ubifs_fill_super(struct super_block *sb, 
> void *data, int silent)
>   goto out_unlock;
>   }
>  
> + if (c->parent_pointer && c->double_hash)
> + sb->s_export_op = _export_ops;
> +
>   /* Read the root inode */
>   root = ubifs_iget(sb, UBIFS_ROOT_INO);
>   if (IS_ERR(root)) {
> -- 
> 2.7.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/4] ocfs2: fix some small problems

2016-12-28 Thread Gang He



>>> 
> Hi Gang, one small comment below:
> 
> On Wed, Dec 21, 2016 at 2:20 AM, Gang He  wrote:
>> First, move setting fe_done = 1 in spin lock, avoid bring
>> any potential race condition. Second, tune mlog message level
>> from ERROR to NOTICE, since the message should not belong to
>> error message.
>>
>> Signed-off-by: Gang He 
>> ---
>>  fs/ocfs2/filecheck.c | 8 
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> @@ -545,11 +545,11 @@ static ssize_t ocfs2_filecheck_store(struct kobject 
> *kobj,
>> spin_lock(&ent->fs_fcheck->fc_lock);
>> if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
>> (ent->fs_fcheck->fc_done == 0)) {
>> -   mlog(ML_ERROR,
>> +   mlog(ML_NOTICE,
>> "Cannot do more file check "
>> "since file check queue(%u) is full now\n",
>> ent->fs_fcheck->fc_max);
>> -   ret = -EBUSY;
>> +   ret = -EAGAIN;
> 
> This change wasn't described in the patch header. Granted, from the
> message above the change, -EAGAIN certainly seems a more reasonable
> return value but it would be good to know whether this was intended
> and why.
Hello Mark, thanks for your comments. I will add a description of this change
in v3.
Do you have any other comments for the other patches in v2?

Thanks
Gang

> 
> Thanks,
>--Mark



Re: [RFC, PATCHv2 29/29] mm, x86: introduce RLIMIT_VADDR

2016-12-28 Thread Carlos O'Donell
On 12/26/2016 09:24 PM, Kirill A. Shutemov wrote:
> On Mon, Dec 26, 2016 at 06:06:01PM -0800, Andy Lutomirski wrote:
>> On Mon, Dec 26, 2016 at 5:54 PM, Kirill A. Shutemov
>>  wrote:
>>> This patch introduces new rlimit resource to manage maximum virtual
>>> address available to userspace to map.
>>>
>>> On x86, 5-level paging enables 56-bit userspace virtual address space.
>>> Not all user space is ready to handle wide addresses. It's known that
>>> at least some JIT compilers use high bit in pointers to encode their
>>> information. It collides with valid pointers with 5-level paging and
>>> leads to crashes.
>>>
>>> The patch aims to address this compatibility issue.
>>>
>>> MM would use min(RLIMIT_VADDR, TASK_SIZE) as upper limit of virtual
>>> address available to map by userspace.
>>>
>>> The default hard limit will be RLIM_INFINITY, which basically means that
>>> TASK_SIZE limits available address space.
>>>
>>> The soft limit will also be RLIM_INFINITY everywhere except on machines
>>> with 5-level paging enabled. In that case, the soft limit would be
>>> (1UL << 47) - PAGE_SIZE, the current x86-64 TASK_SIZE_MAX with 4-level
>>> paging, which is known to be safe.
>>>
>>> New rlimit resource would follow usual semantics with regards to
>>> inheritance: preserved on fork(2) and exec(2). This has potential to
>>> break application if limits set too wide or too narrow, but this is not
>>> uncommon for other resources (consider RLIMIT_DATA or RLIMIT_AS).
>>>
>>> As with other resources you can set the limit lower than current usage.
>>> It would affect only future virtual address space allocations.
>>>
>>> Use-cases for new rlimit:
>>>
>>>   - Bumping the soft limit to RLIM_INFINITY allows the current process
>>> and all its children to use addresses above 47 bits.
>>>
>>>   - Bumping the soft limit to RLIM_INFINITY after fork(2), but before
>>> exec(2), allows the child to use addresses above 47 bits.
>>>
>>>   - Lowering the hard limit to 47 bits would prevent the current process
>>> and all its children from using addresses above 47 bits, unless a
>>> process has CAP_SYS_RESOURCE.
>>>
>>>   - It can also be handy to lower the hard or soft limit to an arbitrary
>>> address. User-mode emulation in QEMU may lower the limit to 32 bits
>>> to emulate a 32-bit machine on a 64-bit host.
>>
>> I tend to think that this should be a personality or an ELF flag, not
>> an rlimit.
> 
> My plan was to implement ELF flag on top. Basically, ELF flag would mean
> that we bump soft limit to hard limit on exec.

Could you clarify what you mean by an "ELF flag?"

-- 
Cheers,
Carlos.
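
Note on the min(RLIMIT_VADDR, TASK_SIZE) rule in the quoted proposal: the arithmetic can be sketched as below. This is a userspace illustration under assumptions, not code from the patch. The SKETCH_* constants and both function names are hypothetical, the 5-level TASK_SIZE is assumed to be (1 << 56) - PAGE_SIZE following the "56-bit" figure in the quoted text, and the page size is assumed to be 4096.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096ULL
#define SKETCH_RLIM_INFINITY	(~0ULL)
/* 4-level value as given in the quoted text; 5-level value is an assumption. */
#define SKETCH_TASK_SIZE_4LVL	((1ULL << 47) - SKETCH_PAGE_SIZE)
#define SKETCH_TASK_SIZE_5LVL	((1ULL << 56) - SKETCH_PAGE_SIZE)

/* Upper bound on mappable addresses: min(RLIMIT_VADDR, TASK_SIZE). */
static uint64_t vaddr_upper_limit(uint64_t rlimit_vaddr, uint64_t task_size)
{
	return rlimit_vaddr < task_size ? rlimit_vaddr : task_size;
}

/* Returns 1 if [addr, addr + len) lies entirely below the limit, else 0. */
static int mapping_allowed(uint64_t addr, uint64_t len,
			   uint64_t rlimit_vaddr, uint64_t task_size)
{
	uint64_t limit = vaddr_upper_limit(rlimit_vaddr, task_size);

	assert(len > 0);
	if (len > limit)
		return 0;
	return addr <= limit - len;	/* written this way to avoid overflow */
}
```

With the proposed default soft limit on a 5-level machine, a request at 1 << 48 is refused. After the soft limit is raised to RLIM_INFINITY, the same request is bounded only by TASK_SIZE.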


[PATCH v5 3/4] clk: rockchip: add new pll-type for rk3328

2016-12-28 Thread Elaine Zhang
The rk3328's PLL and clocks are similar to the rk3036's.
The difference is pll_mode_mask: the rk3328 PLL mode field
is only one bit wide (the rk3036 uses two bits), so the
rk3328 needs its own PLL type, separate from the
rk3036's.

Changes in v4:
  adjust the order of patches 3 and 4.
  move pll_rk3328 to patch 3.
Changes in v3:
  fix up the pll type pll_rk3328 description and use

Signed-off-by: Elaine Zhang 
---
 drivers/clk/rockchip/clk-pll.c | 16 +---
 drivers/clk/rockchip/clk.h |  1 +
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/clk/rockchip/clk-pll.c b/drivers/clk/rockchip/clk-pll.c
index 6ed605776abd..eec51893a7e6 100644
--- a/drivers/clk/rockchip/clk-pll.c
+++ b/drivers/clk/rockchip/clk-pll.c
@@ -29,6 +29,7 @@
 #define PLL_MODE_SLOW  0x0
 #define PLL_MODE_NORM  0x1
 #define PLL_MODE_DEEP  0x2
+#define PLL_RK3328_MODE_MASK   0x1
 
 struct rockchip_clk_pll {
struct clk_hw   hw;
@@ -848,7 +849,8 @@ struct clk *rockchip_clk_register_pll(struct 
rockchip_clk_provider *ctx,
struct clk *pll_clk, *mux_clk;
char pll_name[20];
 
-   if (num_parents != 2) {
+   if ((pll_type != pll_rk3328 && num_parents != 2) ||
+   (pll_type == pll_rk3328 && num_parents != 1)) {
pr_err("%s: needs two parent clocks\n", __func__);
return ERR_PTR(-EINVAL);
}
@@ -865,13 +867,17 @@ struct clk *rockchip_clk_register_pll(struct 
rockchip_clk_provider *ctx,
pll_mux = &pll->pll_mux;
pll_mux->reg = ctx->reg_base + mode_offset;
pll_mux->shift = mode_shift;
-   pll_mux->mask = PLL_MODE_MASK;
+   if (pll_type == pll_rk3328)
+   pll_mux->mask = PLL_RK3328_MODE_MASK;
+   else
+   pll_mux->mask = PLL_MODE_MASK;
pll_mux->flags = 0;
pll_mux->lock = &ctx->lock;
pll_mux->hw.init = &init;
 
if (pll_type == pll_rk3036 ||
pll_type == pll_rk3066 ||
+   pll_type == pll_rk3328 ||
pll_type == pll_rk3399)
pll_mux->flags |= CLK_MUX_HIWORD_MASK;
 
@@ -884,7 +890,10 @@ struct clk *rockchip_clk_register_pll(struct 
rockchip_clk_provider *ctx,
init.flags = CLK_SET_RATE_PARENT;
init.ops = pll->pll_mux_ops;
init.parent_names = pll_parents;
-   init.num_parents = ARRAY_SIZE(pll_parents);
+   if (pll_type == pll_rk3328)
+   init.num_parents = 2;
+   else
+   init.num_parents = ARRAY_SIZE(pll_parents);
 
mux_clk = clk_register(NULL, &pll_mux->hw);
if (IS_ERR(mux_clk))
@@ -918,6 +927,7 @@ struct clk *rockchip_clk_register_pll(struct 
rockchip_clk_provider *ctx,
 
switch (pll_type) {
case pll_rk3036:
+   case pll_rk3328:
if (!pll->rate_table || IS_ERR(ctx->grf))
init.ops = &rockchip_rk3036_pll_clk_norate_ops;
else
diff --git a/drivers/clk/rockchip/clk.h b/drivers/clk/rockchip/clk.h
index d67eecc4ade9..06acb7e0911f 100644
--- a/drivers/clk/rockchip/clk.h
+++ b/drivers/clk/rockchip/clk.h
@@ -130,6 +130,7 @@
 enum rockchip_pll_type {
pll_rk3036,
pll_rk3066,
+   pll_rk3328,
pll_rk3399,
 };
 
-- 
1.9.1
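
The effect of the narrower mask in this patch can be illustrated outside the kernel. The following is a minimal sketch, assuming the usual Rockchip "hiword mask" register layout in which the upper 16 bits of a write act as a write-enable for the lower 16 bits. mux_write_value() is a hypothetical helper, and PLL_MODE_MASK is taken to be 0x3 here, which is an assumption because its definition lies outside the hunks shown.

```c
#include <stdint.h>

#define PLL_MODE_MASK		0x3	/* assumed: two mode bits (rk3036 style) */
#define PLL_RK3328_MODE_MASK	0x1	/* one mode bit, as added by the patch */

/*
 * Hypothetical helper: build the 32-bit value written to a hiword-mask
 * mux register. The low half carries the new mode (truncated to the
 * mask), the high half write-enables exactly the bits covered by `mask`.
 */
static uint32_t mux_write_value(uint32_t mode, uint32_t mask, unsigned int shift)
{
	return ((mode & mask) << shift) | (mask << (shift + 16));
}
```

With the one-bit mask only the bit at `shift` is write-enabled, so a neighbouring bit in the same register is left untouched by the mode switch.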




[PATCH v5 2/4] dt-bindings: add bindings for rk3328 clock controller

2016-12-28 Thread Elaine Zhang
Add devicetree bindings for the Rockchip cru found on
Rockchip SoCs.

Changes in v4:
  drop the "rockchip,cru" and "syscon" properties from the rk3328 bindings

Signed-off-by: Elaine Zhang 
---
 .../bindings/clock/rockchip,rk3328-cru.txt | 57 ++
 1 file changed, 57 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/clock/rockchip,rk3328-cru.txt

diff --git a/Documentation/devicetree/bindings/clock/rockchip,rk3328-cru.txt 
b/Documentation/devicetree/bindings/clock/rockchip,rk3328-cru.txt
new file mode 100644
index 000000000000..e71c675ba5da
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/rockchip,rk3328-cru.txt
@@ -0,0 +1,57 @@
+* Rockchip RK3328 Clock and Reset Unit
+
+The RK3328 clock controller generates and supplies clock to various
+controllers within the SoC and also implements a reset controller for SoC
+peripherals.
+
+Required Properties:
+
+- compatible: should be "rockchip,rk3328-cru"
+- reg: physical base address of the controller and length of memory mapped
+  region.
+- #clock-cells: should be 1.
+- #reset-cells: should be 1.
+
+Optional Properties:
+
+- rockchip,grf: phandle to the syscon managing the "general register files"
+  If missing pll rates are not changeable, due to the missing pll lock status.
+
+Each clock is assigned an identifier and client nodes can use this identifier
+to specify the clock which they consume. All available clocks are defined as
+preprocessor macros in the dt-bindings/clock/rk3328-cru.h headers and can be
+used in device tree sources. Similar macros exist for the reset sources in
+these files.
+
+External clocks:
+
+There are several clocks that are generated outside the SoC. It is expected
+that they are defined using standard clock bindings with following
+clock-output-names:
+ - "xin24m" - crystal input - required,
+ - "clkin_i2s" - external I2S clock - optional,
+ - "gmac_clkin" - external GMAC clock - optional
+ - "phy_50m_out" - output clock of the pll in the mac phy
+
+Example: Clock controller node:
+
+   cru: clock-controller@ff440000 {
+   compatible = "rockchip,rk3328-cru";
+   reg = <0x0 0xff440000 0x0 0x1000>;
+   rockchip,grf = <&grf>;
+
+   #clock-cells = <1>;
+   #reset-cells = <1>;
+   };
+
+Example: UART controller node that consumes the clock generated by the clock
+  controller:
+
+   uart0: serial@ff120000 {
+   compatible = "snps,dw-apb-uart";
+   reg = <0xff120000 0x100>;
+   interrupts = ;
+   reg-shift = <2>;
+   reg-io-width = <4>;
+   clocks = <&cru SCLK_UART0>;
+   };
-- 
1.9.1




[PATCH v5 0/4] clk: rockchip: support clk controller for rk3328 SoC

2016-12-28 Thread Elaine Zhang
Changes in v5:
  fix up some code style, remove grf clk init and cru dump.
Changes in v4:
  dropping the "rockchip,cru" and "syscon" properties for bindings of rk3328
  swap the order of patches 3 and 4.
  move pll_rk3328 to patch 3.
Changes in v3:
  fix up the pll type pll_rk3328 description and use.
Changes in v2:
  add bindings for rk3328 clock controller

Elaine Zhang (4):
  clk: rockchip: add dt-binding header for rk3328
  dt-bindings: add bindings for rk3328 clock controller
  clk: rockchip: add new pll-type for rk3328
  clk: rockchip: add clock controller for rk3328

 .../bindings/clock/rockchip,rk3328-cru.txt |  57 ++
 drivers/clk/rockchip/Makefile  |   1 +
 drivers/clk/rockchip/clk-pll.c |  16 +-
 drivers/clk/rockchip/clk-rk3328.c  | 896 +
 drivers/clk/rockchip/clk.h |  19 +
 include/dt-bindings/clock/rk3328-cru.h | 403 +
 6 files changed, 1389 insertions(+), 3 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/clock/rockchip,rk3328-cru.txt
 create mode 100644 drivers/clk/rockchip/clk-rk3328.c
 create mode 100644 include/dt-bindings/clock/rk3328-cru.h

-- 
1.9.1




[PATCH v5 1/4] clk: rockchip: add dt-binding header for rk3328

2016-12-28 Thread Elaine Zhang
Add the dt-bindings header for the rk3328, that gets shared between
the clock controller and the clock references in the dts.
Add softreset ID for rk3328.

Signed-off-by: Elaine Zhang 
---
 include/dt-bindings/clock/rk3328-cru.h | 403 +
 1 file changed, 403 insertions(+)
 create mode 100644 include/dt-bindings/clock/rk3328-cru.h

diff --git a/include/dt-bindings/clock/rk3328-cru.h 
b/include/dt-bindings/clock/rk3328-cru.h
new file mode 100644
index 000000000000..545ed7541316
--- /dev/null
+++ b/include/dt-bindings/clock/rk3328-cru.h
@@ -0,0 +1,403 @@
+/*
+ * Copyright (c) 2016 Rockchip Electronics Co. Ltd.
+ * Author: Elaine 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _DT_BINDINGS_CLK_ROCKCHIP_RK3328_H
+#define _DT_BINDINGS_CLK_ROCKCHIP_RK3328_H
+
+/* core clocks */
+#define PLL_APLL   1
+#define PLL_DPLL   2
+#define PLL_CPLL   3
+#define PLL_GPLL   4
+#define PLL_NPLL   5
+#define ARMCLK 6
+
+/* sclk gates (special clocks) */
+#define SCLK_RTC32K   30
+#define SCLK_SDMMC_EXT 31
+#define SCLK_SPI   32
+#define SCLK_SDMMC 33
+#define SCLK_SDIO  34
+#define SCLK_EMMC  35
+#define SCLK_TSADC 36
+#define SCLK_SARADC   37
+#define SCLK_UART0 38
+#define SCLK_UART1 39
+#define SCLK_UART2 40
+#define SCLK_I2S0  41
+#define SCLK_I2S1  42
+#define SCLK_I2S2  43
+#define SCLK_I2S1_OUT  44
+#define SCLK_I2S2_OUT  45
+#define SCLK_SPDIF 46
+#define SCLK_TIMER0   47
+#define SCLK_TIMER1   48
+#define SCLK_TIMER2   49
+#define SCLK_TIMER3   50
+#define SCLK_TIMER4   51
+#define SCLK_TIMER5   52
+#define SCLK_WIFI  53
+#define SCLK_CIF_OUT   54
+#define SCLK_I2C0  55
+#define SCLK_I2C1  56
+#define SCLK_I2C2  57
+#define SCLK_I2C3  58
+#define SCLK_CRYPTO   59
+#define SCLK_PWM   60
+#define SCLK_PDM   61
+#define SCLK_EFUSE 62
+#define SCLK_OTP   63
+#define SCLK_DDRCLK   64
+#define SCLK_VDEC_CABAC   65
+#define SCLK_VDEC_CORE 66
+#define SCLK_VENC_DSP  67
+#define SCLK_VENC_CORE 68
+#define SCLK_RGA   69
+#define SCLK_HDMI_SFC  70
+#define SCLK_HDMI_CEC  71
+#define SCLK_USB3_REF  72
+#define SCLK_USB3_SUSPEND  73
+#define SCLK_SDMMC_DRV 74
+#define SCLK_SDIO_DRV  75
+#define SCLK_EMMC_DRV  76
+#define SCLK_SDMMC_EXT_DRV 77
+#define SCLK_SDMMC_SAMPLE  78
+#define SCLK_SDIO_SAMPLE   79
+#define SCLK_EMMC_SAMPLE   80
+#define SCLK_SDMMC_EXT_SAMPLE  81
+#define SCLK_VOP   82
+#define SCLK_MAC2PHY_RXTX  83
+#define SCLK_MAC2PHY_SRC   84
+#define SCLK_MAC2PHY_REF   85
+#define SCLK_MAC2PHY_OUT   86
+#define SCLK_MAC2IO_RX 87
+#define SCLK_MAC2IO_TX 88
+#define SCLK_MAC2IO_REFOUT 89
+#define SCLK_MAC2IO_REF   90
+#define SCLK_MAC2IO_OUT   91
+#define SCLK_TSP   92
+#define SCLK_HSADC_TSP 93
+#define SCLK_USB3PHY_REF   94
+#define SCLK_REF_USB3OTG   95
+#define SCLK_USB3OTG_REF   96
+#define SCLK_USB3OTG_SUSPEND   97
+#define SCLK_REF_USB3OTG_SRC   98
+#define SCLK_MAC2IO_SRC   99
+
+/* dclk gates */
+#define DCLK_LCDC  180
+#define DCLK_HDMIPHY   181
+#define HDMIPHY   182
+#define USB480M   183
+#define DCLK_LCDC_SRC  184
+
+/* aclk gates */
+#define ACLK_AXISRAM   190
+#define ACLK_VOP_PRE   191
+#define ACLK_USB3OTG   192
+#define ACLK_RGA_PRE   193
+#define ACLK_DMAC  194
+#define ACLK_GPU   195
+#define ACLK_BUS_PRE   196
+#define ACLK_PERI_PRE  197
+#define ACLK_RKVDEC_PRE   198
+#define ACLK_RKVDEC   199
+#define ACLK_RKVENC   200
+#define ACLK_VPU_PRE   201
+#define ACLK_VIO_PRE   202
+#define ACLK_VPU   203
+#define ACLK_VIO   204
+#define ACLK_VOP   205
+#define ACLK_GMAC  206
+#define ACLK_H265  207
+#define ACLK_H264  208
+#define ACLK_MAC2PHY   209
+#define 

[PATCH v5 4/4] clk: rockchip: add clock controller for rk3328

2016-12-28 Thread Elaine Zhang
Add the clock tree definition for the new rk3328 SoC.

Changes in v5:
  fix up some code style, remove grf clk init and cru dump.
Changes in v4:
  swap the order of patches 3 and 4.
Changes in v3:
  fix up the pll parent so that it is only xin24m.
Changes in v2:
  fix up the erroneous *_sample descriptions.

Signed-off-by: Elaine Zhang 
---
 drivers/clk/rockchip/Makefile |   1 +
 drivers/clk/rockchip/clk-rk3328.c | 896 ++
 drivers/clk/rockchip/clk.h|  18 +
 3 files changed, 915 insertions(+)
 create mode 100644 drivers/clk/rockchip/clk-rk3328.c

diff --git a/drivers/clk/rockchip/Makefile b/drivers/clk/rockchip/Makefile
index 16e098c36f90..68b04bfca282 100644
--- a/drivers/clk/rockchip/Makefile
+++ b/drivers/clk/rockchip/Makefile
@@ -16,5 +16,6 @@ obj-y += clk-rk3036.o
 obj-y  += clk-rk3188.o
 obj-y  += clk-rk3228.o
 obj-y  += clk-rk3288.o
+obj-y  += clk-rk3328.o
 obj-y  += clk-rk3368.o
 obj-y  += clk-rk3399.o
diff --git a/drivers/clk/rockchip/clk-rk3328.c 
b/drivers/clk/rockchip/clk-rk3328.c
new file mode 100644
index 000000000000..f486ec9e9471
--- /dev/null
+++ b/drivers/clk/rockchip/clk-rk3328.c
@@ -0,0 +1,896 @@
+/*
+ * Copyright (c) 2016 Rockchip Electronics Co. Ltd.
+ * Author: Elaine 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "clk.h"
+
+#define RK3328_GRF_SOC_STATUS0 0x480
+#define RK3328_GRF_MAC_CON1   0x904
+#define RK3328_GRF_MAC_CON2   0x908
+
+enum rk3328_plls {
+   apll, dpll, cpll, gpll, npll,
+};
+
+static struct rockchip_pll_rate_table rk3328_pll_rates[] = {
+   /* _mhz, _refdiv, _fbdiv, _postdiv1, _postdiv2, _dsmpd, _frac */
+   RK3036_PLL_RATE(1608000000, 1, 67, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1584000000, 1, 66, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1560000000, 1, 65, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1536000000, 1, 64, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1512000000, 1, 63, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1488000000, 1, 62, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1464000000, 1, 61, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1440000000, 1, 60, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1416000000, 1, 59, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1392000000, 1, 58, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1368000000, 1, 57, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1344000000, 1, 56, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1320000000, 1, 55, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1296000000, 1, 54, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1272000000, 1, 53, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1248000000, 1, 52, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1200000000, 1, 50, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1188000000, 2, 99, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1104000000, 1, 46, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1100000000, 12, 550, 1, 1, 1, 0),
+   RK3036_PLL_RATE(1008000000, 1, 84, 2, 1, 1, 0),
+   RK3036_PLL_RATE(1000000000, 6, 500, 2, 1, 1, 0),
+   RK3036_PLL_RATE(984000000, 1, 82, 2, 1, 1, 0),
+   RK3036_PLL_RATE(960000000, 1, 80, 2, 1, 1, 0),
+   RK3036_PLL_RATE(936000000, 1, 78, 2, 1, 1, 0),
+   RK3036_PLL_RATE(912000000, 1, 76, 2, 1, 1, 0),
+   RK3036_PLL_RATE(900000000, 4, 300, 2, 1, 1, 0),
+   RK3036_PLL_RATE(888000000, 1, 74, 2, 1, 1, 0),
+   RK3036_PLL_RATE(864000000, 1, 72, 2, 1, 1, 0),
+   RK3036_PLL_RATE(840000000, 1, 70, 2, 1, 1, 0),
+   RK3036_PLL_RATE(816000000, 1, 68, 2, 1, 1, 0),
+   RK3036_PLL_RATE(800000000, 6, 400, 2, 1, 1, 0),
+   RK3036_PLL_RATE(700000000, 6, 350, 2, 1, 1, 0),
+   RK3036_PLL_RATE(696000000, 1, 58, 2, 1, 1, 0),
+   RK3036_PLL_RATE(600000000, 1, 75, 3, 1, 1, 0),
+   RK3036_PLL_RATE(594000000, 2, 99, 2, 1, 1, 0),
+   RK3036_PLL_RATE(504000000, 1, 63, 3, 1, 1, 0),
+   RK3036_PLL_RATE(500000000, 6, 250, 2, 1, 1, 0),
+   RK3036_PLL_RATE(408000000, 1, 68, 2, 2, 1, 0),
+   RK3036_PLL_RATE(312000000, 1, 52, 2, 2, 1, 0),
+   RK3036_PLL_RATE(216000000, 1, 72, 4, 2, 1, 0),
+   RK3036_PLL_RATE(96000000, 1, 64, 4, 4, 1, 0),
+   { /* sentinel */ },
+};
+
+static struct rockchip_pll_rate_table rk3328_pll_frac_rates[] = {
+   /* _mhz, _refdiv, _fbdiv, _postdiv1, _postdiv2, _dsmpd, _frac */
+   RK3036_PLL_RATE(1016064000, 3, 127, 1, 1, 0, 134217),
+   /* vco = 1016064000 */
+   RK3036_PLL_RATE(983040000, 24, 983, 1, 1, 0, 671088),
+   /* vco = 983040000 */
+   RK3036_PLL_RATE(491520000, 24, 983, 2, 1, 0, 671088),
+   /* vco = 983040000 */
+   RK3036_PLL_RATE(6144, 6, 
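
The rate table entries can be cross-checked with a few lines of C. This is a minimal sketch, assuming a 24 MHz reference (the xin24m parent mentioned in the changelog) and the common integer/fractional PLL relation rate = fin / refdiv * (fbdiv + frac / 2^24) / (postdiv1 * postdiv2). pll_rate_hz() is a hypothetical helper, not a function from the driver, and the 24-bit width of the fractional part is an assumption.

```c
#include <stdint.h>

/*
 * Hypothetical helper: output rate in Hz for one table entry.
 * dsmpd == 1 selects integer mode (frac is ignored); dsmpd == 0 adds the
 * fractional part, assumed here to be a 24-bit fraction of fbdiv.
 */
static uint64_t pll_rate_hz(uint64_t fin_hz, uint32_t refdiv, uint32_t fbdiv,
			    uint32_t postdiv1, uint32_t postdiv2,
			    uint32_t dsmpd, uint32_t frac)
{
	uint64_t ref = fin_hz / refdiv;		/* input after the reference divider */
	uint64_t vco = ref * fbdiv;

	if (!dsmpd)
		vco += (ref * frac) >> 24;	/* fractional contribution, rounded down */

	return vco / postdiv1 / postdiv2;
}
```

With refdiv 1, fbdiv 67 and both post dividers at 1 this gives 1608 MHz. The fractional entry with refdiv 3, fbdiv 127 and frac 134217 comes out 1 Hz below 1016064000 because the sketch truncates instead of rounding.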

[PATCH] mm: Drop "PFNs busy" printk in an expected path.

2016-12-28 Thread Eric Anholt
For CMA allocations, we expect to occasionally hit this error path, at
which point CMA will retry.  Given that, we shouldn't be spamming
dmesg about it.

The Raspberry Pi graphics driver does frequent CMA allocations, and
during regression testing this printk was sometimes occurring 100s of
times per second.

Signed-off-by: Eric Anholt 
Cc: linux-stable 
---
 mm/page_alloc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6de9440e3ae2..bea7204c14a5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7289,8 +7289,6 @@ int alloc_contig_range(unsigned long start, unsigned long 
end,
 
/* Make sure the range is really isolated. */
if (test_pages_isolated(outer_start, end, false)) {
-   pr_info("%s: [%lx, %lx) PFNs busy\n",
-   __func__, outer_start, end);
ret = -EBUSY;
goto done;
}
-- 
2.11.0



[PATCH] mm: cma: print allocation failure reason and bitmap status

2016-12-28 Thread Jaewon Kim
CMA allocation can fail for several reasons, such as EBUSY, ENOMEM, or EINTR.
This patch prints the error value and the bitmap status on failure, so that
the number of available pages and their fragmentation can be seen.

Example output for an ENOMEM failure with this patch applied:
[   11.616321]  [2:   Binder:711_1:  740] cma: cma_alloc: alloc failed, req-size: 256 pages, ret: -12
[   11.616365]  [2:   Binder:711_1:  740] number of available pages: 4+7+7+8+38+166+127=>357 pages, total: 2048 pages

Signed-off-by: Jaewon Kim 
---
 mm/cma.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/mm/cma.c b/mm/cma.c
index c960459..535aa39 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -369,7 +369,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align)
unsigned long start = 0;
unsigned long bitmap_maxno, bitmap_no, bitmap_count;
struct page *page = NULL;
-   int ret;
+   int ret = -ENOMEM;
 
if (!cma || !cma->count)
return NULL;
@@ -427,6 +427,33 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align)
trace_cma_alloc(pfn, page, count, align);
 
pr_debug("%s(): returned %p\n", __func__, page);
+
+   if (ret != 0) {
+   unsigned int nr, nr_total = 0;
+   unsigned long next_set_bit;
+
+   pr_info("%s: alloc failed, req-size: %zu pages, ret: %d\n",
+   __func__, count, ret);
+   mutex_lock(&cma->lock);
+   printk("number of available pages: ");
+   start = 0;
+   for (;;) {
+   bitmap_no = find_next_zero_bit(cma->bitmap, cma->count, start);
+   next_set_bit = find_next_bit(cma->bitmap, cma->count, bitmap_no);
+   nr = next_set_bit - bitmap_no;
+   if (bitmap_no >= cma->count)
+   break;
+   if (nr_total == 0)
+   printk("%u", nr);
+   else
+   printk("+%u", nr);
+   nr_total += nr;
+   start = bitmap_no + nr;
+   }
+   printk("=>%u pages, total: %lu pages\n", nr_total, cma->count);
+   mutex_unlock(&cma->lock);
+   }
+
return page;
 }
 
-- 
1.9.1
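
The free-run scan added by the hunk above can be tried outside the kernel. The following is a user-space sketch under stated assumptions, not the patch itself: bit_is_set(), next_bit_with_value() and summarize_free_runs() are hypothetical helpers written for this example (the kernel code uses find_next_zero_bit() and find_next_bit() instead), there is no locking, and the result is written to a caller-supplied buffer rather than printed with printk().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Return 1 if bit `n` of `bitmap` is set (page in use), else 0. */
static int bit_is_set(const unsigned long *bitmap, size_t n)
{
	return (int)((bitmap[n / BITS_PER_WORD] >> (n % BITS_PER_WORD)) & 1UL);
}

/* Index of the first bit >= start whose value equals `want`, or nbits. */
static size_t next_bit_with_value(const unsigned long *bitmap, size_t nbits,
				  size_t start, int want)
{
	while (start < nbits && bit_is_set(bitmap, start) != want)
		start++;
	return start;
}

/*
 * Write a summary such as "3+2=>5 pages, total: 16 pages" into buf, one
 * term per run of clear (free) bits, and return the number of free bits.
 * If buf is too small the text is truncated, but the count is still exact.
 */
size_t summarize_free_runs(const unsigned long *bitmap, size_t nbits,
			   char *buf, size_t buflen)
{
	size_t start = 0, total = 0, used = 0;

	assert(buf != NULL && buflen > 0);
	buf[0] = '\0';

	for (;;) {
		size_t run_start = next_bit_with_value(bitmap, nbits, start, 0);
		size_t run_end, nr;
		int n;

		if (run_start >= nbits)
			break;	/* no free bits left */
		run_end = next_bit_with_value(bitmap, nbits, run_start, 1);
		nr = run_end - run_start;

		if (used < buflen) {
			if (total == 0)
				n = snprintf(buf + used, buflen - used, "%zu", nr);
			else
				n = snprintf(buf + used, buflen - used, "+%zu", nr);
			if (n > 0)
				used += (size_t)n;
		}
		total += nr;
		start = run_end;
	}

	if (used < buflen)
		snprintf(buf + used, buflen - used,
			 "=>%zu pages, total: %zu pages", total, nbits);
	return total;
}
```

Each term in the summary is the length of one contiguous free run, so a request can only succeed if some term is at least as large as the request (alignment aside). In the example log above, 357 pages are free in total, yet the largest run is 166 pages, which is why a 256-page request fails.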


