Re: [PATCH][RFC] mm: warning message for vm_map_ram about vm size

2014-03-09 Thread Minchan Kim
Hi Gioh,

On Mon, Mar 10, 2014 at 01:57:07PM +0900, Gioh Kim wrote:
> Hi,
> 
> I have a failure of allocation of virtual memory on ARMv7 based platform.
> 
> I called alloc_page()/vm_map_ram() for allocation/mapping pages.
> Virtual memory space exhausting problem occurred.
> I checked virtual memory space and found that there are too many 4MB chunks.
> 
> I thought that if just one page in the 4MB chunk lives long, 
> the entire chunk cannot be freed. Therefore new chunk is created again and 
> again.
> 
> In my opinion, the vm_map_ram() function should be used for temporary mapping
> and/or short term memory mapping. Otherwise virtual memory is wasted.
> 
> I am not sure if my opinion is correct. If it is, please add some warning 
> message
> about the vm_map_ram().
> 
> 
> 
> ---8<---
> 
> Subject: [PATCH] mm: warning comment for vm_map_ram
> 
> vm_map_ram can occur locking of virtual memory space
> because if only one page lives long in one vmap_block,
> it takes 4MB (1024-times more than one page) space.

To clarify, vm_map_ram has a fragmentation problem: it cannot
purge a chunk (i.e., 4MB of address space) while there is a pinned
object in that address space, so it can easily consume all of the
VMALLOC address space.

We could fix the fragmentation problem by using vmap instead of
vm_map_ram, but that wouldn't be a good solution because vmap is much
slower than vm_map_ram for allocations below VMAP_MAX_ALLOC. On my x86
machine, vm_map_ram is 5 times faster than vmap.

AFAICR, some proprietary GPU drivers use that function heavily, so
performance is really important and I want to stick with
vm_map_ram.

Another option is for the caller to separate long-lived and short-lived
objects, using vmap for long-lived ones and vm_map_ram for short-lived
ones. But it's not a good solution because it's hard for the allocator
layer to know how long the customer will keep an object.

So I thought about fixing the problem by reverting [1], adding more
logic to solve the fragmentation problem, and making the bitmap search
more efficient by caching the hole. It might handle fragmentation for
now, but it would cause more IPI storms for TLB flushing as time goes
by, which would undermine the API itself. Restricting the API to
temporary objects only is too limiting, but it's the best option at the
moment, so I support your opinion.

Let's add a notice for users.

[1] [3fcd76e8028, mm/vmalloc.c: remove dead code in vb_alloc]

> 
> Change-Id: I6f5919848cf03788b5846b7d850d66e4d93ac39a
> Signed-off-by: Gioh Kim 
> ---
>  mm/vmalloc.c |4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 0fdf968..2de1d1b 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1083,6 +1083,10 @@ EXPORT_SYMBOL(vm_unmap_ram);
>   * @node: prefer to allocate data structures on this node
>   * @prot: memory protection to use. PAGE_KERNEL for regular RAM
>   *
> + * This function should be used for TEMPORARY mapping. If just one page 
> lives i
> + * long, it would occupy 4MB vm size permamently. 100 pages (just 400KB) 
> could
> + * takes 400MB with bad luck.
> + *

If you use this function for fewer than VMAP_MAX_ALLOC pages, it could be
faster than vmap, so it's good. But if you mix long-lived and short-lived
objects with vm_map_ram, it could consume lots of address space through
fragmentation (especially on 32-bit machines), so you could see failures
in the end. So, please use this function for short-lived objects.
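
To put rough numbers on the fragmentation cost discussed in this thread, here is a small userspace sketch (not kernel code). It assumes 4KB pages and 4MB chunks, as in the report; the macro and function names are made up for illustration and do not exist in mm/vmalloc.c.

```c
#include <stdint.h>

/* Assumptions for illustration only: 4KB pages, 4MB chunks. */
#define MODEL_PAGE_BYTES   4096u
#define MODEL_CHUNK_PAGES  1024u
#define MODEL_CHUNK_BYTES  ((uint64_t)MODEL_PAGE_BYTES * MODEL_CHUNK_PAGES)

/* Worst case: every long-lived page ends up in a different chunk,
 * so each one keeps a whole chunk of address space from being purged. */
uint64_t worst_case_pinned_bytes(uint64_t long_lived_pages)
{
    return long_lived_pages * MODEL_CHUNK_BYTES;
}

/* Best case: long-lived pages are packed into as few chunks as possible. */
uint64_t best_case_pinned_bytes(uint64_t long_lived_pages)
{
    uint64_t chunks = (long_lived_pages + MODEL_CHUNK_PAGES - 1) / MODEL_CHUNK_PAGES;
    return chunks * MODEL_CHUNK_BYTES;
}
```

With 100 long-lived pages (400KB of data), the worst case in this model pins 400MB of address space while the best case pins a single 4MB chunk, which is why mixing object lifetimes hurts most on 32-bit machines.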

>   * Returns: a pointer to the address that has been mapped, or %NULL on 
> failure
>   */
>  void *vm_map_ram(struct page **pages, unsigned int count, int node, pgprot_t 
> prot)
> --
> 1.7.9.5
> 
> Gioh Kim / 김 기 오
> Research Engineer
> Advanced OS Technology Team
> Software Platform R Lab.
> Mobile: 82-10-7322-5548  
> E-mail: gioh@lge.com 
> 19, Yangjae-daero 11gil
> Seocho-gu, Seoul 137-130, Korea
> 
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: em...@kvack.org

-- 
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/5] f2fs: optimize restore_node_summary slightly

2014-03-09 Thread Gu Zheng
Hi Kim,
On 03/10/2014 12:45 PM, Jaegeuk Kim wrote:

> Hi Gu,
> 
> 2014-03-07 (Fri), 18:43 +0800, Gu Zheng:
>> Previously, we ra_sum_pages to pre-read contiguous pages as more
>> as possible, and if we fail to alloc more pages, an ENOMEM error
>> will be reported upstream, even though we have alloced some pages
>> yet. In fact, we can use the available pages to do the job partly,
>> and continue the rest in the following circle. Only reporting ENOMEM
>> upstream if we really can not alloc any available page.
>>
>> And another fix is ignoring dealing with the following pages if an
>> EIO occurs when reading page from page_list.
>>
>> Signed-off-by: Gu Zheng 
>> ---
>>  fs/f2fs/node.c|   44 
>>  fs/f2fs/segment.c |7 +--
>>  2 files changed, 25 insertions(+), 26 deletions(-)
>>
>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
>> index 8787469..4b7861d 100644
>> --- a/fs/f2fs/node.c
>> +++ b/fs/f2fs/node.c
>> @@ -1588,15 +1588,8 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
>>  for (; page_idx < start + nrpages; page_idx++) {
>>  /* alloc temporal page for read node summary info*/
>>  page = alloc_page(GFP_F2FS_ZERO);
>> -if (!page) {
>> -struct page *tmp;
>> -list_for_each_entry_safe(page, tmp, pages, lru) {
>> -list_del(&page->lru);
>> -unlock_page(page);
>> -__free_pages(page, 0);
>> -}
>> -return -ENOMEM;
>> -}
>> +if (!page)
>> +break;
>>  
>>  lock_page(page);
>>  page->index = page_idx;
>> @@ -1607,7 +1600,8 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
>>  f2fs_submit_page_mbio(sbi, page, page->index, );
>>  
>>  f2fs_submit_merged_bio(sbi, META, READ);
>> -return 0;
>> +
>> +return page_idx - start;
>>  }
>>  
>>  int restore_node_summary(struct f2fs_sb_info *sbi,
>> @@ -1630,28 +1624,30 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
>>  nrpages = min(last_offset - i, bio_blocks);
>>  
>>  /* read ahead node pages */
>> -err = ra_sum_pages(sbi, &page_list, addr, nrpages);
>> -if (err)
>> -return err;
>> +nrpages = ra_sum_pages(sbi, &page_list, addr, nrpages);
>> +if (!nrpages)
>> +return -ENOMEM;
>>  
>>  list_for_each_entry_safe(page, tmp, &page_list, lru) {
>> -
> 
> Here we can just add:
>   if (err)
>   goto skip;
>   lock_page();
>   ...
>   unlock_page();
>   skip:
>   list_del();
>   __free_pages();
> 
> IMO, it's more neat, so if you have any objection, let me know.
> Otherwise, I'll handle this by myself. :)

Thanks very much.

Regards,
Gu

> Thanks,
> 
>> -lock_page(page);
>> -if (unlikely(!PageUptodate(page))) {
>> -err = -EIO;
>> -} else {
>> -rn = F2FS_NODE(page);
>> -sum_entry->nid = rn->footer.nid;
>> -sum_entry->version = 0;
>> -sum_entry->ofs_in_node = 0;
>> -sum_entry++;
>> +if (!err) {
>> +lock_page(page);
>> +if (unlikely(!PageUptodate(page))) {
>> +err = -EIO;
>> +} else {
>> +rn = F2FS_NODE(page);
>> +sum_entry->nid = rn->footer.nid;
>> +sum_entry->version = 0;
>> +sum_entry->ofs_in_node = 0;
>> +sum_entry++;
>> +}
>> +unlock_page(page);
>>  }
>>  
>>  list_del(&page->lru);
>> -unlock_page(page);
>>  __free_pages(page, 0);
>>  }
>>  }
>> +
>>  return err;
>>  }
>>  
>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
>> index 199c964..b3f8431 100644
>> --- a/fs/f2fs/segment.c
>> +++ b/fs/f2fs/segment.c
>> @@ -1160,9 +1160,12 @@ static int read_normal_summaries(struct f2fs_sb_info *sbi, int type)
>>  ns->ofs_in_node = 0;
>>  }
>>  } else {
>> -if (restore_node_summary(sbi, segno, sum)) {
>> +int err;
>> +
>> +err = restore_node_summary(sbi, segno, sum);
>> +if (err) {
>>  

Re: [PATCH 5/5] f2fs: add a wait queue to avoid unnecessary, build_free_nid

2014-03-09 Thread Gu Zheng
Hi Kim,
On 03/10/2014 12:50 PM, Jaegeuk Kim wrote:

> Hi Gu,
> 
> 2014-03-07 (Fri), 18:43 +0800, Gu Zheng:
>> Previously, when we try to alloc free nid while the build free nid
>> is going, the allocer will be run into the flow that waiting for
>> "nm_i->build_lock", see following:
>>  /* We should not use stale free nids created by build_free_nids */
>> >if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
>>  f2fs_bug_on(list_empty(&nm_i->free_nid_list));
>>  list_for_each(this, &nm_i->free_nid_list) {
>>  i = list_entry(this, struct free_nid, list);
>>  if (i->state == NID_NEW)
>>  break;
>>  }
>>
>>  f2fs_bug_on(i->state != NID_NEW);
>>  *nid = i->nid;
>>  i->state = NID_ALLOC;
>>  nm_i->fcnt--;
>>  spin_unlock(&nm_i->free_nid_list_lock);
>>  return true;
>>  }
>>  spin_unlock(&nm_i->free_nid_list_lock);
>>
>>  /* Let's scan nat pages and its caches to get free nids */
>> >mutex_lock(&nm_i->build_lock);
>>  build_free_nids(sbi);
>>  mutex_unlock(&nm_i->build_lock);
>> and this will cause another unnecessary building free nid if the current
>> building free nid job is done.
> 
> Could you support any performance number for this?

I just ran some common tests via fio on a simulated SSD (via loop).

> Since, IMO, the contended building processes will be released right away
> because of the following condition check inside build_free_nids().
> 
> if (nm_i->fcnt > NAT_ENTRY_PER_BLOCK)
>   return;

It does. But, IMO, we cannot guarantee that nm_i->fcnt > NAT_ENTRY_PER_BLOCK holds
when the contended building process enters, especially under high concurrency.

> 
> So, I don't think this gives us any high latency.
> Can the wakeup_all() become another overhead all the time?

Yeah, maybe we should test whether it can also cause a performance regression,
because wake_up_all() also introduces overhead, as you said.
But the bad news is that I do not have a production environment to test it in;
as you know, the simulated environment is not rigorous.

cc Yu,
Could you please help to test it?

Regards,
Gu

> Thanks,
> 
>> So here we introduce a wait_queue to avoid this issue.
>>
>> Signed-off-by: Gu Zheng 
>> ---
>>  fs/f2fs/f2fs.h |1 +
>>  fs/f2fs/node.c |   10 +-
>>  2 files changed, 10 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index f845e92..7ae193e 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -256,6 +256,7 @@ struct f2fs_nm_info {
>>  spinlock_t free_nid_list_lock;  /* protect free nid list */
>>  unsigned int fcnt;  /* the number of free node id */
>>  struct mutex build_lock;/* lock for build free nids */
>> +wait_queue_head_t build_wq; /* wait queue for build free nids */
>>  
>>  /* for checkpoint */
>>  char *nat_bitmap;   /* NAT bitmap pointer */
>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
>> index 4b7861d..ab44711 100644
>> --- a/fs/f2fs/node.c
>> +++ b/fs/f2fs/node.c
>> @@ -1422,7 +1422,13 @@ retry:
>>  spin_lock(&nm_i->free_nid_list_lock);
>>  
>>  /* We should not use stale free nids created by build_free_nids */
>> -if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
>> +if (on_build_free_nids(nm_i)) {
>> +spin_unlock(&nm_i->free_nid_list_lock);
>> +wait_event(nm_i->build_wq, !on_build_free_nids(nm_i));
>> +goto retry;
>> +}
>> +
>> +if (nm_i->fcnt) {
>>  f2fs_bug_on(list_empty(&nm_i->free_nid_list));
>>  list_for_each(this, &nm_i->free_nid_list) {
>>  i = list_entry(this, struct free_nid, list);
>> @@ -1443,6 +1449,7 @@ retry:
>>  mutex_lock(&nm_i->build_lock);
>>  build_free_nids(sbi);
>>  mutex_unlock(&nm_i->build_lock);
>> +wake_up_all(&nm_i->build_wq);
>>  goto retry;
>>  }
>>  
>> @@ -1813,6 +1820,7 @@ static int init_node_manager(struct f2fs_sb_info *sbi)
>>  INIT_LIST_HEAD(&nm_i->dirty_nat_entries);
>>  
>>  mutex_init(&nm_i->build_lock);
>> +init_waitqueue_head(&nm_i->build_wq);
>>  spin_lock_init(&nm_i->free_nid_list_lock);
>>  rwlock_init(&nm_i->nat_tree_lock);
>>  
> 




RE: [f2fs-dev] [PATCH 4/5] f2fs: optimize restore_node_summary slightly

2014-03-09 Thread Jaegeuk Kim
Hi,

2014-03-10 (Mon), 13:13 +0800, Chao Yu:
> Hi Gu, Kim:
> 
> One more comment.
> 
> > -Original Message-
> > From: Jaegeuk Kim [mailto:jaegeuk@samsung.com]
> > Sent: Monday, March 10, 2014 12:46 PM
> > To: Gu Zheng
> > Cc: linux-kernel; f2fs
> > Subject: Re: [f2fs-dev] [PATCH 4/5] f2fs: optimize restore_node_summary 
> > slightly
> > 
> > Hi Gu,
> > 
> > 2014-03-07 (Fri), 18:43 +0800, Gu Zheng:
> > > Previously, we ra_sum_pages to pre-read contiguous pages as more
> > > as possible, and if we fail to alloc more pages, an ENOMEM error
> > > will be reported upstream, even though we have alloced some pages
> > > yet. In fact, we can use the available pages to do the job partly,
> > > and continue the rest in the following circle. Only reporting ENOMEM
> > > upstream if we really can not alloc any available page.
> > >
> > > And another fix is ignoring dealing with the following pages if an
> > > EIO occurs when reading page from page_list.
> > >
> > > Signed-off-by: Gu Zheng 
> 
> Reviewed-by: Chao Yu 
> 
> > > ---
> > >  fs/f2fs/node.c|   44 
> > >  fs/f2fs/segment.c |7 +--
> > >  2 files changed, 25 insertions(+), 26 deletions(-)
> > >
> > > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > > index 8787469..4b7861d 100644
> > > --- a/fs/f2fs/node.c
> > > +++ b/fs/f2fs/node.c
> > > @@ -1588,15 +1588,8 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
> > >   for (; page_idx < start + nrpages; page_idx++) {
> > >   /* alloc temporal page for read node summary info*/
> > >   page = alloc_page(GFP_F2FS_ZERO);
> > > - if (!page) {
> > > - struct page *tmp;
> > > - list_for_each_entry_safe(page, tmp, pages, lru) {
> > > - list_del(&page->lru);
> > > - unlock_page(page);
> > > - __free_pages(page, 0);
> > > - }
> > > - return -ENOMEM;
> > > - }
> > > + if (!page)
> > > + break;
> > >
> > >   lock_page(page);
> > >   page->index = page_idx;
> > > @@ -1607,7 +1600,8 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
> > >   f2fs_submit_page_mbio(sbi, page, page->index, );
> > >
> > >   f2fs_submit_merged_bio(sbi, META, READ);
> > > - return 0;
> > > +
> > > + return page_idx - start;
> > >  }
> > >
> > >  int restore_node_summary(struct f2fs_sb_info *sbi,
> > > @@ -1630,28 +1624,30 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
> > >   nrpages = min(last_offset - i, bio_blocks);
> > >
> > >   /* read ahead node pages */
> > > - err = ra_sum_pages(sbi, &page_list, addr, nrpages);
> > > - if (err)
> > > - return err;
> > > + nrpages = ra_sum_pages(sbi, &page_list, addr, nrpages);
> > > + if (!nrpages)
> > > + return -ENOMEM;
> > >
> > >   list_for_each_entry_safe(page, tmp, &page_list, lru) {
> > > -
> > 
> > Here we can just add:
> > if (err)
> > goto skip;
> > lock_page();
> > ...
> > unlock_page();
> > skip:
> > list_del();
> > __free_pages();
> > 
> > IMO, it's more neat, so if you have any objection, let me know.
> > Otherwise, I'll handle this by myself. :)
> > Thanks,
> > 
> > > - lock_page(page);
> > > - if (unlikely(!PageUptodate(page))) {
> > > - err = -EIO;
> > > - } else {
> > > - rn = F2FS_NODE(page);
> > > - sum_entry->nid = rn->footer.nid;
> > > - sum_entry->version = 0;
> > > - sum_entry->ofs_in_node = 0;
> > > - sum_entry++;
> > > + if (!err) {
> 
> If we skip here, next round we will fill these summary page entries with
> wrong info because we skip the code 'sum_entry++;'.

There is no next round. Once err = -EIO, there's no route to make err =
0.
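
The "sticky error" behaviour discussed here can be modelled in userspace. The following is only a sketch of the control flow proposed in this thread (skip processing once err is set, but still release every page); the struct, the MODEL_EIO constant and the function name are hypothetical stand-ins, not f2fs code.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EIO 5  /* stand-in for the kernel's EIO; value chosen for the model */

struct model_page {
    bool uptodate;   /* models PageUptodate(page) */
    bool freed;      /* set when the model "frees" the page */
    unsigned nid;    /* models rn->footer.nid */
};

/* Walks the pages once. After the first failed page, err stays set, so no
 * further summary entries are written, yet every page is still released.
 * Returns 0 or -MODEL_EIO; *filled receives the number of entries written. */
int model_restore(struct model_page *pages, size_t n, unsigned *out, size_t *filled)
{
    int err = 0;
    size_t i;

    *filled = 0;
    for (i = 0; i < n; i++) {
        if (err)
            goto skip;               /* nothing ever resets err to 0 */
        if (!pages[i].uptodate)
            err = -MODEL_EIO;
        else
            out[(*filled)++] = pages[i].nid;
skip:
        pages[i].freed = true;       /* models list_del() + __free_pages() */
    }
    return err;
}
```

In this model the output cursor stops advancing at the first bad page and is never used again, which matches the argument that a skipped sum_entry++ cannot lead to entries being filled with wrong data later.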

> 
> > > + lock_page(page);
> > > + if (unlikely(!PageUptodate(page))) {
> > > + err = -EIO;
> > > + } else {
> > > + rn = F2FS_NODE(page);
> > > + sum_entry->nid = rn->footer.nid;
> > > + sum_entry->version = 0;
> > > + sum_entry->ofs_in_node = 0;
> > > + sum_entry++;
> > > + }
> > > + unlock_page(page);
> > >   }
> > >
> > >   list_del(&page->lru);
> > > - unlock_page(page);
> > >   __free_pages(page, 

Re: [PATCH 1/2] cpufreq: Return error if ->get() failed in cpufreq_update_policy()

2014-03-09 Thread Viresh Kumar
On 26 February 2014 13:15, Viresh Kumar  wrote:
> On 26 February 2014 03:59, Rafael J. Wysocki  wrote:
>> Yes, what exactly do we need it for in the core?
>
> It's probably there to make things faster. We cache the value so that we
> don't go to the hardware to read/calculate it again, don't we?
>
> And we need to know the current freq on many occasions. One of them is that
> many drivers need to know the relation between the current and new freq before
> they can make the change, as they might need to adjust voltage regulators
> before or after the freq change. It is also used mainly in our loops_per_jiffy
> calculations.

Ping!!


Re: [f2fs-dev] [PATCH 5/5] f2fs: add a wait queue to avoid unnecessary, build_free_nid

2014-03-09 Thread Gu Zheng
Hi Changman,
On 03/10/2014 12:09 PM, Changman Lee wrote:

> On Fri, 2014-03-07 at 18:43 +0800, Gu Zheng wrote:
>> Previously, when we try to alloc free nid while the build free nid
>> is going, the allocer will be run into the flow that waiting for
>> "nm_i->build_lock", see following:
>>  /* We should not use stale free nids created by build_free_nids */
>> >if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
>>  f2fs_bug_on(list_empty(&nm_i->free_nid_list));
>>  list_for_each(this, &nm_i->free_nid_list) {
>>  i = list_entry(this, struct free_nid, list);
>>  if (i->state == NID_NEW)
>>  break;
>>  }
>>
>>  f2fs_bug_on(i->state != NID_NEW);
>>  *nid = i->nid;
>>  i->state = NID_ALLOC;
>>  nm_i->fcnt--;
>>  spin_unlock(&nm_i->free_nid_list_lock);
>>  return true;
>>  }
>>  spin_unlock(&nm_i->free_nid_list_lock);
>>
>>  /* Let's scan nat pages and its caches to get free nids */
>> >mutex_lock(&nm_i->build_lock);
>>  build_free_nids(sbi);
>>  mutex_unlock(&nm_i->build_lock);
>> and this will cause another unnecessary building free nid if the current
>> building free nid job is done.
>> So here we introduce a wait_queue to avoid this issue.
>>
>> Signed-off-by: Gu Zheng 
>> ---
>>  fs/f2fs/f2fs.h |1 +
>>  fs/f2fs/node.c |   10 +-
>>  2 files changed, 10 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index f845e92..7ae193e 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -256,6 +256,7 @@ struct f2fs_nm_info {
>>  spinlock_t free_nid_list_lock;  /* protect free nid list */
>>  unsigned int fcnt;  /* the number of free node id */
>>  struct mutex build_lock;/* lock for build free nids */
>> +wait_queue_head_t build_wq; /* wait queue for build free nids */
>>  
>>  /* for checkpoint */
>>  char *nat_bitmap;   /* NAT bitmap pointer */
>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
>> index 4b7861d..ab44711 100644
>> --- a/fs/f2fs/node.c
>> +++ b/fs/f2fs/node.c
>> @@ -1422,7 +1422,13 @@ retry:
>>  spin_lock(&nm_i->free_nid_list_lock);
>>  
>>  /* We should not use stale free nids created by build_free_nids */
>> -if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
>> +if (on_build_free_nids(nm_i)) {
>> +spin_unlock(&nm_i->free_nid_list_lock);
>> +wait_event(nm_i->build_wq, !on_build_free_nids(nm_i));
>> +goto retry;
>> +}
>> +
> 
> It would be better moving spin_lock(free_nid_list_lock) here after
> removing above spin_unlock().

Agree. It's better to place spin_lock here to avoid needless lock protection.

Regards,
Gu

> 
>> +if (nm_i->fcnt) {
>>  f2fs_bug_on(list_empty(&nm_i->free_nid_list));
>>  list_for_each(this, &nm_i->free_nid_list) {
>>  i = list_entry(this, struct free_nid, list);
>> @@ -1443,6 +1449,7 @@ retry:
>>  mutex_lock(&nm_i->build_lock);
>>  build_free_nids(sbi);
>>  mutex_unlock(&nm_i->build_lock);
>> +wake_up_all(&nm_i->build_wq);
>>  goto retry;
>>  }
>>  
>> @@ -1813,6 +1820,7 @@ static int init_node_manager(struct f2fs_sb_info *sbi)
>>  INIT_LIST_HEAD(&nm_i->dirty_nat_entries);
>>  
>>  mutex_init(&nm_i->build_lock);
>> +init_waitqueue_head(&nm_i->build_wq);
>>  spin_lock_init(&nm_i->free_nid_list_lock);
>>  rwlock_init(&nm_i->nat_tree_lock);
>>  
> 
> 
> 




[V2 PATCH 0/2] Bug fix in aio ring page migration

2014-03-09 Thread Tang Chen

This patch-set fixes the following two problems:

1. Need to use ctx->completion_lock to protect ring pages
   from being mis-written during migration.

2. Need memory barrier to ensure memory copy is done before
   ctx->ring_pages[] is updated.

NOTE: AIO ring page migration has been implemented since Linux 3.12,
  so these two patches need to be merged into the 3.12 stable tree.

Tang Chen (2):
  aio, memory-hotplug: Fix confliction when migrating and accessing
ring pages.
  aio, mem-hotplug: Add memory barrier to aio ring page migration.

 fs/aio.c |   42 ++
 1 files changed, 42 insertions(+), 0 deletions(-)

--
1.7.7


[V2 PATCH 2/2] aio, mem-hotplug: Add memory barrier to aio ring page migration.

2014-03-09 Thread Tang Chen

When doing aio ring page migration, we migrate the page and then update
ctx->ring_pages[], as follows:

aio_migratepage()
 |-> migrate_page_copy(new, old)
 |   .. /* Need barrier here */
 |-> ctx->ring_pages[idx] = new

Actually, we need a memory barrier between these two operations.
Otherwise, if ctx->ring_pages[] is updated before the memory copy due to
compiler optimization, other processes may get an opportunity
to access the new ring page before it is fully initialized.

So add a wmb and rmb to synchronize them.

v2:
  change smp_rmb() to smp_read_barrier_depends(). Thanks Miao.
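
The ordering requirement above is the usual "initialize, then publish the pointer" pattern. As a rough userspace analogue (an illustration, not the kernel change itself), the sketch below uses C11 atomics: a release store stands in for smp_wmb() followed by the pointer update, and an acquire load stands in for the pointer read followed by smp_read_barrier_depends(). Acquire is stronger than a dependency barrier and is swapped in here only for simplicity; the type and function names are hypothetical.

```c
#include <stdatomic.h>
#include <string.h>

#define MODEL_RING_BYTES 64

struct model_ring_page {
    unsigned char data[MODEL_RING_BYTES];
};

/* Models one slot of ctx->ring_pages[]; NULL until a page is published. */
static _Atomic(struct model_ring_page *) ring_slot;

/* Migration side: finish the copy, then publish the new page.
 * The release store keeps the copy from being reordered after the publish. */
void publish_migrated(struct model_ring_page *new_page,
                      const struct model_ring_page *old_page)
{
    memcpy(new_page->data, old_page->data, MODEL_RING_BYTES);
    atomic_store_explicit(&ring_slot, new_page, memory_order_release);
}

/* Reader side: load the pointer, then read through it.
 * The acquire load pairs with the release store above. */
unsigned char read_first_byte(void)
{
    struct model_ring_page *p =
        atomic_load_explicit(&ring_slot, memory_order_acquire);
    return p ? p->data[0] : 0;
}
```

A reader that observes the new pointer through read_first_byte() is then guaranteed to see the copied contents as well, which is the property the two barriers are meant to provide for the ring page.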

Signed-off-by: Yasuaki Ishimatsu 
Signed-off-by: Tang Chen 
---
 fs/aio.c |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index dc70246..4133ba9 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -327,6 +327,14 @@ static int aio_migratepage(struct address_space *mapping, struct page *new,

pgoff_t idx;
spin_lock_irqsave(&ctx->completion_lock, flags);
migrate_page_copy(new, old);
+
+   /*
+* Ensure memory copy is finished before updating
+* ctx->ring_pages[]. Otherwise other processes may access to
+* new ring pages which are not fully initialized.
+*/
+   smp_wmb();
+
idx = old->index;
if (idx < (pgoff_t)ctx->nr_pages) {
/* And only do the move if things haven't changed */
@@ -1069,6 +1077,12 @@ static long aio_read_events_ring(struct kioctx *ctx,
page = ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE];
pos %= AIO_EVENTS_PER_PAGE;

+   /*
+* Ensure that the page's data was copied from old one by
+* aio_migratepage().
+*/
+   smp_read_barrier_depends();
+
ev = kmap(page);
copy_ret = copy_to_user(event + ret, ev + pos,
sizeof(*ev) * avail);
--
1.7.7




[V2 PATCH 1/2] aio, memory-hotplug: Fix confliction when migrating and, accessing ring pages

2014-03-09 Thread Tang Chen

AIO ring page migration has been implemented by the following patch:


https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/fs/aio.c?id=36bc08cc01709b4a9bb563b35aa530241ddc63e3

In this patch, ctx->completion_lock is used to prevent other processes
from accessing the ring page being migrated.

But in aio_setup_ring(), ioctx_add_table() and aio_read_events_ring(),
when writing to the ring page, they didn't take ctx->completion_lock.

As a result, for example, we have the following problem:

thread 1                                |  thread 2
                                        |
aio_migratepage()                       |
 |-> take ctx->completion_lock          |
 |-> migrate_page_copy(new, old)        |
 |   *NOW*, ctx->ring_pages[idx] == old |
                                        |
                                        |  *NOW*, ctx->ring_pages[idx] == old
                                        |  aio_read_events_ring()
                                        |   |-> ring = kmap_atomic(ctx->ring_pages[0])
                                        |   |-> ring->head = head;  *HERE, write to the old ring page*
                                        |   |-> kunmap_atomic(ring);
                                        |
 |-> ctx->ring_pages[idx] = new         |
 |   *BUT NOW*, the content of          |
 |    ring_pages[idx] is old.           |
 |-> release ctx->completion_lock       |

As above, the new ring page will not be updated.

The solution is taking ctx->completion_lock in thread 2, which means,
in aio_setup_ring(), ioctx_add_table() and aio_read_events_ring() when
writing to ring pages.
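
The lost update in the diagram can be replayed deterministically in userspace. The sketch below is a single-threaded model of the two orderings, not the actual aio code: the structs and function names are hypothetical, and "taking the lock" is modelled simply by forcing the writer to run after the slot has been switched.

```c
/* Models the part of the ring header that the writer touches. */
struct model_ring {
    unsigned head;
};

/* Models a context with one slot of ctx->ring_pages[]. */
struct model_ctx {
    struct model_ring *ring_page;
};

/* Models the writer: it writes through whatever page the slot points to. */
static void writer_set_head(struct model_ctx *ctx, unsigned head)
{
    ctx->ring_page->head = head;
}

/* Unlocked interleaving: the write lands between the copy and the switch. */
unsigned racy_migration(unsigned new_head)
{
    struct model_ring old_page = { .head = 1 };
    struct model_ring new_page = { .head = 0 };
    struct model_ctx ctx = { .ring_page = &old_page };

    new_page = old_page;              /* models migrate_page_copy(new, old) */
    writer_set_head(&ctx, new_head);  /* writer still sees the old page */
    ctx.ring_page = &new_page;        /* models ctx->ring_pages[idx] = new */
    return ctx.ring_page->head;       /* the update was lost */
}

/* Serialized ordering: the writer waits until migration has finished. */
unsigned serialized_migration(unsigned new_head)
{
    struct model_ring old_page = { .head = 1 };
    struct model_ring new_page = { .head = 0 };
    struct model_ctx ctx = { .ring_page = &old_page };

    new_page = old_page;
    ctx.ring_page = &new_page;
    writer_set_head(&ctx, new_head);  /* runs only after the switch */
    return ctx.ring_page->head;
}
```

In this model racy_migration(42) returns the stale value 1, while serialized_migration(42) returns 42, which is the effect that holding ctx->completion_lock on the writer side is intended to achieve.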

v2:
  Use spin_lock_irq rather than spin_lock_irqsave as Jeff suggested.

Reported-by: Yasuaki Ishimatsu 
Reviewed-by: Jeff Moyer 
Signed-off-by: Tang Chen 
---
 fs/aio.c |   28 
 1 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 062a5f6..dc70246 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -437,6 +437,14 @@ static int aio_setup_ring(struct kioctx *ctx)
ctx->user_id = ctx->mmap_base;
ctx->nr_events = nr_events; /* trusted copy */

+   /*
+* The aio ring pages are user space pages, so they can be migrated.
+* When writing to an aio ring page, we should ensure the page is not
+* being migrated. Aio page migration procedure is protected by
+* ctx->completion_lock, so we add this lock here.
+*/
+   spin_lock_irq(&ctx->completion_lock);
+
ring = kmap_atomic(ctx->ring_pages[0]);
ring->nr = nr_events;/* user copy */
ring->id = ~0U;
@@ -448,6 +456,8 @@ static int aio_setup_ring(struct kioctx *ctx)
kunmap_atomic(ring);
flush_dcache_page(ctx->ring_pages[0]);

+   spin_unlock_irq(&ctx->completion_lock);
+
return 0;
 }

@@ -556,9 +566,17 @@ static int ioctx_add_table(struct kioctx *ctx, struct mm_struct *mm)

rcu_read_unlock();
spin_unlock(&mm->ioctx_lock);

+   /*
+* Accessing ring pages must be done
+* holding ctx->completion_lock to
+* prevent aio ring page migration
+* procedure from migrating ring pages.
+*/
+   spin_lock_irq(&ctx->completion_lock);
ring = kmap_atomic(ctx->ring_pages[0]);
ring->id = ctx->id;
kunmap_atomic(ring);
+   spin_unlock_irq(&ctx->completion_lock);
return 0;
}

@@ -1066,11 +1084,21 @@ static long aio_read_events_ring(struct kioctx *ctx,
head %= ctx->nr_events;
}

+   /*
+* The aio ring pages are user space pages, so they can be migrated.
+* When writing to an aio ring page, we should ensure the page is not
+* being migrated. Aio page migration procedure is protected by
+* ctx->completion_lock, so we add this lock here.
+*/
+   spin_lock_irq(&ctx->completion_lock);
+
ring = kmap_atomic(ctx->ring_pages[0]);
ring->head = head;
kunmap_atomic(ring);
flush_dcache_page(ctx->ring_pages[0]);

+   spin_unlock_irq(&ctx->completion_lock);
+
pr_debug("%li  h%u t%u\n", ret, head, tail);

put_reqs_available(ctx, ret);
--
1.7.7




Re: [PATCH net V2] vhost: net: switch to use data copy if pending DMAs exceed the limit

2014-03-09 Thread Jason Wang
On 03/08/2014 05:39 AM, David Miller wrote:
> From: Jason Wang 
> Date: Fri,  7 Mar 2014 13:28:27 +0800
>
>> This is because the delay added by htb may lead the delay the finish
>> of DMAs and cause the pending DMAs for tap0 exceeds the limit
>> (VHOST_MAX_PEND). In this case vhost stop handling tx request until
>> htb send some packets. The problem here is all of the packets
>> transmission were blocked even if it does not go to VM2.
> Isn't this essentially head of line blocking?

Yes it is.
>> We can solve this issue by relaxing it a little bit: switching to use
>> data copy instead of stopping tx when the number of pending DMAs
>> exceed half of the vq size. This is safe because:
>>
>> - The number of pending DMAs were still limited (half of the vq size)
>> - The out of order completion during mode switch can make sure that
>>   most of the tx buffers were freed in time in guest.
>>
>> So even if about 50% packets were delayed in zero-copy case, vhost
>> could continue to do the transmission through data copy in this case.
>>
>> Test result:
>>
>> Before this patch:
>> VM1 to VM2 throughput is 9.3Mbit/s
>> VM1 to External throughput is 40Mbit/s
>> CPU utilization is 7%
>>
>> After this patch:
>> VM1 to VM2 throughput is 9.3Mbit/s
>> Vm1 to External throughput is 93Mbit/s
>> CPU utilization is 16%
>>
>> Completed performance test on 40gbe shows no obvious changes in both
>> throughput and cpu utilization with this patch.
>>
>> The patch only solve this issue when unlimited sndbuf. We still need a
>> solution for limited sndbuf.
>>
>> Cc: Michael S. Tsirkin 
>> Cc: Qin Chuanyu 
>> Signed-off-by: Jason Wang 
> I'd like some vhost experts reviewing this before I apply it.

Sure.


RE: [f2fs-dev] [PATCH 4/5] f2fs: optimize restore_node_summary slightly

2014-03-09 Thread Chao Yu
Hi Gu, Kim:

One more comment.

> -Original Message-
> From: Jaegeuk Kim [mailto:jaegeuk@samsung.com]
> Sent: Monday, March 10, 2014 12:46 PM
> To: Gu Zheng
> Cc: linux-kernel; f2fs
> Subject: Re: [f2fs-dev] [PATCH 4/5] f2fs: optimize restore_node_summary 
> slightly
> 
> Hi Gu,
> 
> 2014-03-07 (Fri), 18:43 +0800, Gu Zheng:
> > Previously, we ra_sum_pages to pre-read contiguous pages as more
> > as possible, and if we fail to alloc more pages, an ENOMEM error
> > will be reported upstream, even though we have alloced some pages
> > yet. In fact, we can use the available pages to do the job partly,
> > and continue the rest in the following circle. Only reporting ENOMEM
> > upstream if we really can not alloc any available page.
> >
> > And another fix is ignoring dealing with the following pages if an
> > EIO occurs when reading page from page_list.
> >
> > Signed-off-by: Gu Zheng 

Reviewed-by: Chao Yu 

> > ---
> >  fs/f2fs/node.c|   44 
> >  fs/f2fs/segment.c |7 +--
> >  2 files changed, 25 insertions(+), 26 deletions(-)
> >
> > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > index 8787469..4b7861d 100644
> > --- a/fs/f2fs/node.c
> > +++ b/fs/f2fs/node.c
> > @@ -1588,15 +1588,8 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
> > for (; page_idx < start + nrpages; page_idx++) {
> > /* alloc temporal page for read node summary info*/
> > page = alloc_page(GFP_F2FS_ZERO);
> > -   if (!page) {
> > -   struct page *tmp;
> > -   list_for_each_entry_safe(page, tmp, pages, lru) {
> > -   list_del(&page->lru);
> > -   unlock_page(page);
> > -   __free_pages(page, 0);
> > -   }
> > -   return -ENOMEM;
> > -   }
> > +   if (!page)
> > +   break;
> >
> > lock_page(page);
> > page->index = page_idx;
> > @@ -1607,7 +1600,8 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
> > f2fs_submit_page_mbio(sbi, page, page->index, );
> >
> > f2fs_submit_merged_bio(sbi, META, READ);
> > -   return 0;
> > +
> > +   return page_idx - start;
> >  }
> >
> >  int restore_node_summary(struct f2fs_sb_info *sbi,
> > @@ -1630,28 +1624,30 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
> > nrpages = min(last_offset - i, bio_blocks);
> >
> > /* read ahead node pages */
> > -   err = ra_sum_pages(sbi, &page_list, addr, nrpages);
> > -   if (err)
> > -   return err;
> > +   nrpages = ra_sum_pages(sbi, &page_list, addr, nrpages);
> > +   if (!nrpages)
> > +   return -ENOMEM;
> >
> > list_for_each_entry_safe(page, tmp, &page_list, lru) {
> > -
> 
> Here we can just add:
>   if (err)
>   goto skip;
>   lock_page();
>   ...
>   unlock_page();
>   skip:
>   list_del();
>   __free_pages();
> 
> IMO, it's more neat, so if you have any objection, let me know.
> Otherwise, I'll handle this by myself. :)
> Thanks,
> 
> > -   lock_page(page);
> > -   if (unlikely(!PageUptodate(page))) {
> > -   err = -EIO;
> > -   } else {
> > -   rn = F2FS_NODE(page);
> > -   sum_entry->nid = rn->footer.nid;
> > -   sum_entry->version = 0;
> > -   sum_entry->ofs_in_node = 0;
> > -   sum_entry++;
> > +   if (!err) {

If we skip here, next round we will fill these summary page entries with
wrong info because we skip the code 'sum_entry++;'.

> > +   lock_page(page);
> > +   if (unlikely(!PageUptodate(page))) {
> > +   err = -EIO;
> > +   } else {
> > +   rn = F2FS_NODE(page);
> > +   sum_entry->nid = rn->footer.nid;
> > +   sum_entry->version = 0;
> > +   sum_entry->ofs_in_node = 0;
> > +   sum_entry++;
> > +   }
> > +   unlock_page(page);
> > }
> >
> > list_del(&page->lru);
> > -   unlock_page(page);
> > __free_pages(page, 0);
> > }

Maybe we should add code here.
if (err)
return err;

> > }
> > +
> > return err;
> >  }
> >
> > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > index 

[PATCH][RFC] mm: warning message for vm_map_ram about vm size

2014-03-09 Thread Gioh Kim
Hi,

I am seeing virtual memory allocation failures on an ARMv7-based platform.

I called alloc_page()/vm_map_ram() to allocate and map pages, and the
virtual memory space became exhausted.
I checked the virtual memory space and found that it contained too many
4MB chunks.

I think that if just one page in a 4MB chunk is long-lived, the entire
chunk cannot be freed, so new chunks are created again and again.

In my opinion, the vm_map_ram() function should be used only for temporary
and/or short-term memory mappings. Otherwise virtual memory is wasted.

I am not sure if my opinion is correct. If it is, please add a warning
comment about this to vm_map_ram().
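
To make the worst case concrete, here is a small user-space sketch of the
arithmetic. It is not kernel code: it assumes 4KB pages and the 4MB chunk
size described above, and the macro and function names are made up for
illustration only.

```c
#include <assert.h>
#include <limits.h>

#define PAGE_SIZE_BYTES  (4UL * 1024)         /* assumption: 4KB pages */
#define VMAP_BLOCK_BYTES (4UL * 1024 * 1024)  /* the 4MB chunk from the report */

/* Number of pages that fit into one chunk: 4MB / 4KB = 1024. */
static unsigned long pages_per_block(void)
{
	return VMAP_BLOCK_BYTES / PAGE_SIZE_BYTES;
}

/* Memory actually mapped by n long-lived single-page mappings. */
static unsigned long mapped_bytes(unsigned long n_pinned_pages)
{
	return n_pinned_pages * PAGE_SIZE_BYTES;
}

/*
 * Worst case: every long-lived page sits in a different chunk, so each
 * one keeps a whole chunk of address space from being released.
 */
static unsigned long worst_case_vm_bytes(unsigned long n_pinned_pages)
{
	/* guard against overflow on 32-bit unsigned long */
	assert(n_pinned_pages <= ULONG_MAX / VMAP_BLOCK_BYTES);
	return n_pinned_pages * VMAP_BLOCK_BYTES;
}
```

With 100 long-lived pages this gives 400KB of mapped memory against 400MB
of pinned address space, provided every such page lands in a different chunk.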



---8<---

Subject: [PATCH] mm: warning comment for vm_map_ram

vm_map_ram can lock up virtual memory space:
if just one page in a vmap_block is long-lived,
it pins 4MB (1024 times the size of one page) of address space.

Change-Id: I6f5919848cf03788b5846b7d850d66e4d93ac39a
Signed-off-by: Gioh Kim 
---
 mm/vmalloc.c |4 
 1 file changed, 4 insertions(+)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 0fdf968..2de1d1b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1083,6 +1083,10 @@ EXPORT_SYMBOL(vm_unmap_ram);
  * @node: prefer to allocate data structures on this node
  * @prot: memory protection to use. PAGE_KERNEL for regular RAM
  *
+ * This function should be used for TEMPORARY mapping. If just one page lives
+ * long, it would occupy 4MB vm size permanently. 100 pages (just 400KB) could
+ * take 400MB with bad luck.
+ *
  * Returns: a pointer to the address that has been mapped, or %NULL on failure
  */
 void *vm_map_ram(struct page **pages, unsigned int count, int node, pgprot_t prot)
--
1.7.9.5

Gioh Kim / 김 기 오
Research Engineer
Advanced OS Technology Team
Software Platform R&D Lab.
Mobile: 82-10-7322-5548  
E-mail: gioh@lge.com 
19, Yangjae-daero 11gil
Seocho-gu, Seoul 137-130, Korea


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 5/5] f2fs: add a wait queue to avoid unnecessary, build_free_nid

2014-03-09 Thread Jaegeuk Kim
Hi Gu,

2014-03-07 (금), 18:43 +0800, Gu Zheng:
> Previously, when we try to alloc free nid while the build free nid
> is going, the allocer will be run into the flow that waiting for
> "nm_i->build_lock", see following:
>   /* We should not use stale free nids created by build_free_nids */
> > if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
>   f2fs_bug_on(list_empty(&nm_i->free_nid_list));
>   list_for_each(this, &nm_i->free_nid_list) {
>   i = list_entry(this, struct free_nid, list);
>   if (i->state == NID_NEW)
>   break;
>   }
> 
>   f2fs_bug_on(i->state != NID_NEW);
>   *nid = i->nid;
>   i->state = NID_ALLOC;
>   nm_i->fcnt--;
>   spin_unlock(&nm_i->free_nid_list_lock);
>   return true;
>   }
>   spin_unlock(&nm_i->free_nid_list_lock);
> 
>   /* Let's scan nat pages and its caches to get free nids */
> > mutex_lock(&nm_i->build_lock);
>   build_free_nids(sbi);
>   mutex_unlock(&nm_i->build_lock);
> and this will cause another unnecessary building free nid if the current
> building free nid job is done.

Could you provide any performance numbers for this?
IMO, the contended building processes will be released right away
because of the following condition check inside build_free_nids().

if (nm_i->fcnt > NAT_ENTRY_PER_BLOCK)
return;

So, I don't think this causes any high latency.
Couldn't the wake_up_all() itself become an extra overhead on every call?
Thanks,
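
The early return mentioned above can be illustrated with a minimal
user-space sketch. This is not the f2fs code: struct nm_info, the scans
counter, the refill amount and the value given to NAT_ENTRY_PER_BLOCK are
all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NAT_ENTRY_PER_BLOCK 455u  /* placeholder value, assumed for the sketch */

struct nm_info {
	unsigned int fcnt;   /* number of free nids currently cached */
	unsigned int scans;  /* how many times the expensive scan really ran */
};

/*
 * Stand-in for build_free_nids(): a caller that was waiting on the lock
 * while another thread refilled the list returns at the first check and
 * does not repeat the scan.
 */
static void build_free_nids(struct nm_info *nm)
{
	assert(nm != NULL);

	if (nm->fcnt > NAT_ENTRY_PER_BLOCK)
		return;                       /* enough free nids: nothing to do */

	nm->scans++;                          /* the expensive part */
	nm->fcnt += 2 * NAT_ENTRY_PER_BLOCK;  /* hypothetical refill amount */
}
```

Calling build_free_nids() twice in a row on an empty structure bumps scans
only once, which is the behaviour the question above relies on.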

> So here we introduce a wait_queue to avoid this issue.
> 
> Signed-off-by: Gu Zheng 
> ---
>  fs/f2fs/f2fs.h |1 +
>  fs/f2fs/node.c |   10 +-
>  2 files changed, 10 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index f845e92..7ae193e 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -256,6 +256,7 @@ struct f2fs_nm_info {
>   spinlock_t free_nid_list_lock;  /* protect free nid list */
>   unsigned int fcnt;  /* the number of free node id */
>   struct mutex build_lock;/* lock for build free nids */
> + wait_queue_head_t build_wq; /* wait queue for build free nids */
>  
>   /* for checkpoint */
>   char *nat_bitmap;   /* NAT bitmap pointer */
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index 4b7861d..ab44711 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -1422,7 +1422,13 @@ retry:
>   spin_lock(&nm_i->free_nid_list_lock);
>  
>   /* We should not use stale free nids created by build_free_nids */
> - if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
> + if (on_build_free_nids(nm_i)) {
> + spin_unlock(&nm_i->free_nid_list_lock);
> + wait_event(nm_i->build_wq, !on_build_free_nids(nm_i));
> + goto retry;
> + }
> +
> + if (nm_i->fcnt) {
>   f2fs_bug_on(list_empty(&nm_i->free_nid_list));
>   list_for_each(this, &nm_i->free_nid_list) {
>   i = list_entry(this, struct free_nid, list);
> @@ -1443,6 +1449,7 @@ retry:
>   mutex_lock(&nm_i->build_lock);
>   build_free_nids(sbi);
>   mutex_unlock(&nm_i->build_lock);
> + wake_up_all(&nm_i->build_wq);
>   goto retry;
>  }
>  
> @@ -1813,6 +1820,7 @@ static int init_node_manager(struct f2fs_sb_info *sbi)
>   INIT_LIST_HEAD(&nm_i->dirty_nat_entries);
>  
>   mutex_init(&nm_i->build_lock);
> + init_waitqueue_head(&nm_i->build_wq);
>   spin_lock_init(&nm_i->free_nid_list_lock);
>   rwlock_init(&nm_i->nat_tree_lock);
>  

-- 
Jaegeuk Kim
Samsung



Re: [PATCH 4/5] f2fs: optimize restore_node_summary slightly

2014-03-09 Thread Jaegeuk Kim
Hi Gu,

2014-03-07 (금), 18:43 +0800, Gu Zheng:
> Previously, we ra_sum_pages to pre-read contiguous pages as more
> as possible, and if we fail to alloc more pages, an ENOMEM error
> will be reported upstream, even though we have alloced some pages
> yet. In fact, we can use the available pages to do the job partly,
> and continue the rest in the following circle. Only reporting ENOMEM
> upstream if we really can not alloc any available page.
> 
> And another fix is ignoring dealing with the following pages if an
> EIO occurs when reading page from page_list.
> 
> Signed-off-by: Gu Zheng 
> ---
>  fs/f2fs/node.c|   44 
>  fs/f2fs/segment.c |7 +--
>  2 files changed, 25 insertions(+), 26 deletions(-)
> 
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index 8787469..4b7861d 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -1588,15 +1588,8 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
>   for (; page_idx < start + nrpages; page_idx++) {
>   /* alloc temporal page for read node summary info*/
>   page = alloc_page(GFP_F2FS_ZERO);
> - if (!page) {
> - struct page *tmp;
> - list_for_each_entry_safe(page, tmp, pages, lru) {
> - list_del(&page->lru);
> - unlock_page(page);
> - __free_pages(page, 0);
> - }
> - return -ENOMEM;
> - }
> + if (!page)
> + break;
>  
>   lock_page(page);
>   page->index = page_idx;
> @@ -1607,7 +1600,8 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
>   f2fs_submit_page_mbio(sbi, page, page->index, &fio);
>  
>   f2fs_submit_merged_bio(sbi, META, READ);
> - return 0;
> +
> + return page_idx - start;
>  }
>  
>  int restore_node_summary(struct f2fs_sb_info *sbi,
> @@ -1630,28 +1624,30 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
>   nrpages = min(last_offset - i, bio_blocks);
>  
>   /* read ahead node pages */
> - err = ra_sum_pages(sbi, &page_list, addr, nrpages);
> - if (err)
> - return err;
> + nrpages = ra_sum_pages(sbi, &page_list, addr, nrpages);
> + if (!nrpages)
> + return -ENOMEM;
>  
>   list_for_each_entry_safe(page, tmp, &page_list, lru) {
> -

Here we can just add:
if (err)
goto skip;
lock_page();
...
unlock_page();
skip:
list_del();
__free_pages();

IMO, it's more neat, so if you have any objection, let me know.
Otherwise, I'll handle this by myself. :)
Thanks,
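
A minimal user-space sketch of the suggested loop shape, assuming a plain
array in place of the page list. struct fake_page, restore_entries() and
the out array are hypothetical stand-ins, not f2fs code. Once err is set,
the remaining pages skip the processing step but are still released.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct fake_page {
	int uptodate;  /* stands in for PageUptodate() */
	int nid;       /* stands in for rn->footer.nid */
	int freed;     /* set when the page has been released */
};

/*
 * Walk all pages; copy nids into out[] until the first read error,
 * then only release the rest. Returns 0 or -EIO.
 */
static int restore_entries(struct fake_page *pages, size_t n,
			   int *out, size_t *filled)
{
	int err = 0;
	size_t i;

	assert(pages != NULL && out != NULL && filled != NULL);
	*filled = 0;

	for (i = 0; i < n; i++) {
		struct fake_page *page = &pages[i];

		if (err)
			goto skip;  /* earlier failure: do not process, just free */

		if (!page->uptodate)
			err = -EIO;
		else
			out[(*filled)++] = page->nid;
skip:
		page->freed = 1;    /* every page is released exactly once */
	}
	return err;
}
```

The point of the shape is that the release step is shared by the normal and
the error path, so no page is leaked when a read fails halfway through.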

> - lock_page(page);
> - if (unlikely(!PageUptodate(page))) {
> - err = -EIO;
> - } else {
> - rn = F2FS_NODE(page);
> - sum_entry->nid = rn->footer.nid;
> - sum_entry->version = 0;
> - sum_entry->ofs_in_node = 0;
> - sum_entry++;
> + if (!err) {
> + lock_page(page);
> + if (unlikely(!PageUptodate(page))) {
> + err = -EIO;
> + } else {
> + rn = F2FS_NODE(page);
> + sum_entry->nid = rn->footer.nid;
> + sum_entry->version = 0;
> + sum_entry->ofs_in_node = 0;
> + sum_entry++;
> + }
> + unlock_page(page);
>   }
>  
>   list_del(&page->lru);
> - unlock_page(page);
>   __free_pages(page, 0);
>   }
>   }
> +
>   return err;
>  }
>  
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index 199c964..b3f8431 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -1160,9 +1160,12 @@ static int read_normal_summaries(struct f2fs_sb_info *sbi, int type)
>   ns->ofs_in_node = 0;
>   }
>   } else {
> - if (restore_node_summary(sbi, segno, sum)) {
> + int err;
> +
> + err = restore_node_summary(sbi, segno, sum);
> + if (err) {
>   f2fs_put_page(new, 1);
> - return -EINVAL;
> + return err;
>   

Re: [x86, vdso] BUG: unable to handle kernel paging request at d34bd000

2014-03-09 Thread Andy Lutomirski
On Sun, Mar 9, 2014 at 8:18 PM, Andy Lutomirski  wrote:
> (Of course, I haven't the faintest idea what l_addr in glibc means.
> If there was a way to arrange for l_addr to be zero, then maybe none
> of this would matter.  Hmm, I wonder if just not relocating the vdso
> at all would have the desired effect.  Anyone out there understand
> glibc?)

No, that won't work.  The bug is that glibc expects PT_DYNAMIC's vaddr
to be the virtual address of the dynamic table.  This can only be true
if the vdso is mapped at the address that the kernel relocated it to.

I also learned that glibc's code is really hideous.  Wow.

--Andy


Re: [PATCHv8 2/4] power_supply: Introduce generic psy charging driver

2014-03-09 Thread Jenny Tc
On Fri, Mar 07, 2014 at 09:25:20PM +0100, Pavel Machek wrote:

Hi,

> > The Power Supply charging driver connects multiple subsystems
> > to do charging in a generic way. The subsystems involves power_supply,
> > thermal and battery communication subsystems (1wire).With this the charging is
> > handled in a generic way.
> 
> " " after ".", please.

Will fix in next patch set

> > +
> > +The Power Supply charging driver connects multiple subsystems
> > +to do charging in a generic way. The subsystems involves power_supply,
> > +thermal and battery communication subsystems (1wire).With this the charging is
> 
> Here too.

Same here
> 
> 
> 
> > +
> > +The driver introduces different new features - Battery Identification
> > +interfaces, pluggable charging algorithms, charger cable arbitrations etc.
> > +
> > +In existing driver implementations the charging is done based on the static
> > +battery characteristics. This is done at the boot time by passing the battery
> > +properties (max_voltage, capacity) etc. as a platform data to the
> 
> -> (max_voltage, capacity, etc.)

Same here
> 
> > --- a/drivers/power/Kconfig
> > +++ b/drivers/power/Kconfig
> > @@ -14,6 +14,14 @@ config POWER_SUPPLY_DEBUG
> >   Say Y here to enable debugging messages for power supply class
> >   and drivers.
> >  
> > +config POWER_SUPPLY_CHARGER
> > +   bool "Power Supply Charger"
> > +   help
> > + Say Y here to enable the power supply charging control driver. Charging
> > + control supports charging in a generic way. This allows the charger
> > + drivers to keep the charging logic outside and the charger driver
> > + just need to abstract the charger hardware.
> > +
> 
> Umm. This is not too helpful for our users.

Will add more text to help users.
> 
> > +struct psy_charger_context {
> > +   bool is_usb_cable_evt_reg;
> > +   int psyc_cnt;
> > +   int batt_status;
> > +   /* cache battery and charger properties */
> > +   struct list_head chrgr_cache_lst;
> > +   struct list_head batt_cache_lst;
> > +   struct mutex event_lock;
> > +   struct list_head event_queue;
> > +   struct psy_batt_chrg_prof batt_property;
> > +   wait_queue_head_t wait_chrg_enable;
> > +   spinlock_t battid_spinlock;
> > +   spinlock_t event_queue_lock;
> > +   struct work_struct event_work;
> > +};
> > +
> > +struct charger_cable {
> > +   struct psy_cable_props cable_props;
> > +   enum psy_charger_cable_type psy_cable_type;
> > +};
> > +
> > +static struct psy_charger_context psy_chrgr;
> 
> You still miss some vowels here. Sometimes it makes it unclear:
> chrg is charge? charger?

chrgr means charger, chrg means charge. Isn't it used consistently? I can fix it
if it's really annoying. Please suggest.
> 
> 
> > +static inline bool psy_is_charger_prop_changed(struct psy_charger_props prop,
> > +   struct psy_charger_props cache_prop)
> > +{
> > +   /* if online/prsent/health/is_charging is changed, then return true */
> 
> Typo - present.

Will fix in next patch set
> 
> > +static inline void cache_chrgr_prop(struct psy_charger_props *chrgr_prop_new)
> > +{
> > +   struct psy_charger_props *chrgr_cache;
> > +
> > +   list_for_each_entry(chrgr_cache, &psy_chrgr.chrgr_cache_lst, node) {
> > +   if (!strcmp(chrgr_cache->name, chrgr_prop_new->name))
> > +   goto update_props;
> > +   }
> 
> Interesting use of goto. Maybe update_properties should be separate function?

I feel that having a function just for a few assignments may not be a good idea.
What about using memcpy instead?
> 
> > +update_props:
> > +   chrgr_cache->is_charging = chrgr_prop_new->is_charging;
> > +   chrgr_cache->online = chrgr_prop_new->online;
> > +   chrgr_cache->health = chrgr_prop_new->health;
> > +   chrgr_cache->present = chrgr_prop_new->present;
> > +   chrgr_cache->cable = chrgr_prop_new->cable;
> > +   chrgr_cache->tstamp = chrgr_prop_new->tstamp;
> > +   chrgr_cache->psyc = chrgr_prop_new->psyc;
> > +}
> > +
> > +   chrgr_prop.psyc = chrgr_prop_cache.psyc;
> > +   cache_chrgr_prop(&chrgr_prop);
> > +   return true;
> > +}
> > +static void cache_successive_samples(long *sample_array, long new_sample)
> > +{
> 
> Add empty line between the functions.

Ok..Will fix in next patch set
> 
> > +static inline void cache_bat_prop(struct psy_batt_props *bat_prop_new)
> > +{
> > +   struct psy_batt_props *bat_cache;
> > +
> > +   /*
> > +   *  Find entry in cache list. If an entry is located update
> > +   *  the existing entry else create new entry in the list
> > +   */
> > +   list_for_each_entry(bat_cache, &psy_chrgr.batt_cache_lst, node) {
> > +   if (!strcmp(bat_cache->name, bat_prop_new->name))
> > +   goto update_props;
> > +   }
> > +
> > +   bat_cache = kzalloc(sizeof(*bat_cache), GFP_KERNEL);
> 
> What is it with all the caching? Is it good idea to have caches
> indexed by strings? Can't you go without caching, or attach caches to
> some structure? Interesting goto again.

Cache is to store 

Re: [PATCH] cpufreq: use cpufreq_cpu_get to avoid cpufreq_get race conditions

2014-03-09 Thread Viresh Kumar
On 6 March 2014 09:23, Rafael J. Wysocki  wrote:
> On Tuesday, March 04, 2014 12:42:15 PM Aaron Plattner wrote:
>> If a module calls cpufreq_get while cpufreq is initializing, it's possible for
>> it to be called after cpufreq_driver is set but before cpufreq_cpu_data is
>> written during subsys_interface_register.  This happens because cpufreq_get
>> doesn't take the cpufreq_driver_lock around its use of cpufreq_cpu_data.
>
> Is this a theoretical race, or can you actually reproduce it?  If so, on what
> system/driver?  Or are there any bug reports related to this you can point me
> to?
>
>> Fix this by using cpufreq_cpu_get(cpu) to look up the policy rather than reading
>> it out of cpufreq_cpu_data directly.  cpufreq_cpu_get takes the appropriate
>> locks to prevent this race from happening.
>>
>> Since it's possible for policy to be NULL if the caller passes in an invalid CPU
>> number or calls the function before cpufreq is initialized, delete the
>> BUG_ON(!policy) and simply return 0.  Don't try to return -ENOENT because that's
>> negative and the function returns an unsigned integer.
>>
>> Signed-off-by: Aaron Plattner 
>
> Viresh, have you seen this?

Sorry for being late. Though I see you have already applied this one,
I will still add this for the record :)

Acked-by: Viresh Kumar 


Linux 3.14-rc6

2014-03-09 Thread Linus Torvalds
We're getting closer to the end of the rc cycle, and I have to admit
that I would have wished for a less bumpy ride.

There haven't been any huge problems, but there's been quite a few
small bumps that shouldn't happen this late in the release cycle. And
rc6 is noticeably bigger than rc5 was, as well.

So I'm really hoping that the upcoming week will be calmer, because
otherwise I'll start thinking rc8 and even rc9..

That said, there's nothing really fundamentally scary here. Small
stupid mistakes, and a few late reverts of commits that turned out to
not be so great, but the bulk is trivial fixes. So I'm still
reasonably optimistic.

 Linus

---

Aaron Plattner (1):
  cpufreq: use cpufreq_cpu_get() to avoid cpufreq_get() race conditions

Akash Goel (1):
  drm/i915: Resolving the memory region conflict for Stolen area

Alex Deucher (4):
  drm/radeon: resume old pm late
  drm/radeon/cik: fix typo in documentation
  drm/radeon/dpm: fix typo in EVERGREEN_SMC_FIRMWARE_HEADER_softRegisters
  drm/radeon/atom: select the proper number of lanes in transmitter setup

Alexander Stein (2):
  spi/topcliff-pch: Fix DMA channel
  spi-topcliff-pch: Fix probing when DMA mode is used

Alexandre Bounine (1):
  rapidio/tsi721: fix tasklet termination in dma channel release

Amir Vadai (1):
  net,IB/mlx: Bump all Mellanox driver versions

Amitkumar Karwar (3):
  mwifiex: add NULL check for PCIe Rx skb
  mwifiex: fix cmd and Tx data timeout issue for PCIe cards
  NFC: NCI: Fix NULL pointer dereference

Andrew Bresticker (3):
  clk: tegra: fix sdmmc clks on Tegra1x4
  clk: tegra: cclk_lp has a pllx/2 divider
  clk: tegra: use max divider if divider overflows

Andy Adamson (1):
  NFSv4.1 Fail data server I/O if stateid represents a lost lock

Andy Honig (1):
  kallsyms: fix absolute addresses for kASLR

Anton Blanchard (1):
  powerpc: Align p_dyn, p_rela and p_st symbols

Arend van Spriel (1):
  brcmfmac: fix txglomming scatter-gather packet transfers

Arik Nemtsov (1):
  mac80211: fix sched_scan restart on recovery

Asai Thambi S P (1):
  mtip32xx: Reduce the number of unaligned writes to 2

Avinash Patil (1):
  mwifiex: clean pcie ring only when device is present

Axel Lin (2):
  spi: fsl-dspi: Fix getting correct address for master
  spi: coldfire-qspi: Fix getting correct address for *mcfqspi

Barry Song (1):
  pinctrl: sirf: fix kernel panic in gpio_lock_as_irq

Benoit Cousson (1):
  clk: shmobile: rcar-gen2: Use kick bit to allow Z clock frequency change

Bing Zhao (2):
  mwifiex: rename usb driver name registerring to usb core
  mwifiex: do not advertise usb autosuspend support

Borislav Petkov (2):
  MAINTAINERS: EDAC: add Mauro and Borislav as interim patch collectors
  x86/efi: Quirk out SGI UV

Chen-Yu Tsai (1):
  pinctrl: sunxi: use chained_irq_{enter, exit} for GIC compatibility

Christian Daudt (1):
  pinctrl: refer to updated dt binding string.

Christoph Hellwig (3):
  blk-mq: remove blk_mq_alloc_rq
  blk-mq: merge blk_mq_insert_request and blk_mq_run_request
  blk-mq: support partial I/O completions

Chuansheng Liu (1):
  genirq: Remove racy waitqueue_active check

Cristian Bercaru (1):
  phy: unmask link partner capabilities

Dan Carpenter (2):
  hsr: off by one sanity check in hsr_register_frame_in()
  qlcnic: dcb: a couple off by one bugs

Dan Williams (1):
  dma debug: account for cachelines and read-only mappings in overlap tracking

Daniel Borkmann (2):
  net: sctp: rework multihoming retransmission path selection to rfc4960
  net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable

Daniel M. Weeks (1):
  scripts/gen_initramfs_list.sh: fix flags for initramfs LZ4 compression

Dave Airlie (1):
  MAINTAINERS: update AGP tree to point at drm tree

David A. Long (1):
  ARM: 7964/1: Detect section mismatches in thumb relocations

David Howells (1):
  KEYS: Make the keyring cycle detector ignore other keyrings of the same name

David Rientjes (1):
  mm: close PageTail race

David S. Miller (1):
  ip_tunnel: Move ip_tunnel_get_stats64 into ip_tunnel_core.c

David Ung (1):
  clk: tegra: PLLD2 fixes for hdmi

Ditang Chen (1):
  SUNRPC: Fix oops when trace sunrpc_task events in nfs client

Duan Fugang-B38611 (1):
  net: fec: fix potential issue to avoid fec interrupt lost and crc error

Duan Jiong (1):
  neigh: recompute reachabletime before returning from neigh_periodic_work()

Edward Cree (1):
  sfc: check for NULL efx->ptp_data in efx_ptp_event

Emmanuel Grumbach (2):
  mac80211: fix AP powersave TX vs. wakeup race
  iwlwifi: dvm: clear IWL_STA_UCODE_INPROGRESS when assoc fails

Eric Dumazet (3):
  net-tcp: fastopen: fix high order allocations
  tcp: reduce the bloat caused by tcp_is_cwnd_limited()
  net: tcp: use NET_INC_STATS()

Eytan Lifshitz (1):
  

Re: [PATCH] cpufreq: Reformat printk() statement

2014-03-09 Thread Viresh Kumar
On 10 March 2014 12:07, Joe Perches  wrote:
> On Mon, 2014-03-10 at 11:53 +0800, Viresh Kumar wrote:
>> On 7 March 2014 01:34, Soren Brinkmann  wrote:
>> > Reformat a printk statement to:
>> >  - use pr_warn
>> >  - bring the whole string into a single line in favor of being able to
>> >grep for the message (ignoring the 80 char limit)
>> >
>> > Signed-off-by: Soren Brinkmann 
> []
>> Acked-by: Viresh Kumar 
>
> A more comprehensive patch would be to:
>
> Add missing newlines
> Coalesce format fragments
> Convert printks to pr_
> Align arguments
> ---
>  drivers/cpufreq/cpufreq.c | 85 ++-
>  1 file changed, 40 insertions(+), 45 deletions(-)

Thanks..

Acked-by: Viresh Kumar 


Re: [f2fs-dev] [PATCH 5/5] f2fs: add a wait queue to avoid unnecessary, build_free_nid

2014-03-09 Thread Changman Lee
On 금, 2014-03-07 at 18:43 +0800, Gu Zheng wrote:
> Previously, when we try to alloc free nid while the build free nid
> is going, the allocer will be run into the flow that waiting for
> "nm_i->build_lock", see following:
>   /* We should not use stale free nids created by build_free_nids */
> > if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
>   f2fs_bug_on(list_empty(&nm_i->free_nid_list));
>   list_for_each(this, &nm_i->free_nid_list) {
>   i = list_entry(this, struct free_nid, list);
>   if (i->state == NID_NEW)
>   break;
>   }
> 
>   f2fs_bug_on(i->state != NID_NEW);
>   *nid = i->nid;
>   i->state = NID_ALLOC;
>   nm_i->fcnt--;
>   spin_unlock(&nm_i->free_nid_list_lock);
>   return true;
>   }
>   spin_unlock(&nm_i->free_nid_list_lock);
> 
>   /* Let's scan nat pages and its caches to get free nids */
> > mutex_lock(&nm_i->build_lock);
>   build_free_nids(sbi);
>   mutex_unlock(&nm_i->build_lock);
> and this will cause another unnecessary building free nid if the current
> building free nid job is done.
> So here we introduce a wait_queue to avoid this issue.
> 
> Signed-off-by: Gu Zheng 
> ---
>  fs/f2fs/f2fs.h |1 +
>  fs/f2fs/node.c |   10 +-
>  2 files changed, 10 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index f845e92..7ae193e 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -256,6 +256,7 @@ struct f2fs_nm_info {
>   spinlock_t free_nid_list_lock;  /* protect free nid list */
>   unsigned int fcnt;  /* the number of free node id */
>   struct mutex build_lock;/* lock for build free nids */
> + wait_queue_head_t build_wq; /* wait queue for build free nids */
>  
>   /* for checkpoint */
>   char *nat_bitmap;   /* NAT bitmap pointer */
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index 4b7861d..ab44711 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -1422,7 +1422,13 @@ retry:
>   spin_lock(&nm_i->free_nid_list_lock);
>  
>   /* We should not use stale free nids created by build_free_nids */
> - if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
> + if (on_build_free_nids(nm_i)) {
> + spin_unlock(&nm_i->free_nid_list_lock);
> + wait_event(nm_i->build_wq, !on_build_free_nids(nm_i));
> + goto retry;
> + }
> +

It would be better to move spin_lock(free_nid_list_lock) here and
remove the spin_unlock() above.

> + if (nm_i->fcnt) {
>   f2fs_bug_on(list_empty(&nm_i->free_nid_list));
>   list_for_each(this, &nm_i->free_nid_list) {
>   i = list_entry(this, struct free_nid, list);
> @@ -1443,6 +1449,7 @@ retry:
>   mutex_lock(&nm_i->build_lock);
>   build_free_nids(sbi);
>   mutex_unlock(&nm_i->build_lock);
> + wake_up_all(&nm_i->build_wq);
>   goto retry;
>  }
>  
> @@ -1813,6 +1820,7 @@ static int init_node_manager(struct f2fs_sb_info *sbi)
>   INIT_LIST_HEAD(&nm_i->dirty_nat_entries);
>  
>   mutex_init(&nm_i->build_lock);
> + init_waitqueue_head(&nm_i->build_wq);
>   spin_lock_init(&nm_i->free_nid_list_lock);
>   rwlock_init(&nm_i->nat_tree_lock);
>  




Re: [PATCH] cpufreq: Reformat printk() statement

2014-03-09 Thread Joe Perches
On Mon, 2014-03-10 at 11:53 +0800, Viresh Kumar wrote:
> On 7 March 2014 01:34, Soren Brinkmann  wrote:
> > Reformat a printk statement to:
> >  - use pr_warn
> >  - bring the whole string into a single line in favor of being able to
> >grep for the message (ignoring the 80 char limit)
> >
> > Signed-off-by: Soren Brinkmann 
[]
> Acked-by: Viresh Kumar 

A more comprehensive patch would be to:

Add missing newlines
Coalesce format fragments
Convert printks to pr_
Align arguments
---
 drivers/cpufreq/cpufreq.c | 85 ++-
 1 file changed, 40 insertions(+), 45 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 56b7b1b..c2d06e9 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -181,8 +181,8 @@ unsigned int cpufreq_generic_get(unsigned int cpu)
struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
 
if (!policy || IS_ERR(policy->clk)) {
-   pr_err("%s: No %s associated to cpu: %d\n", __func__,
-   policy ? "clk" : "policy", cpu);
+   pr_err("%s: No %s associated to cpu: %d\n",
+  __func__, policy ? "clk" : "policy", cpu);
return 0;
}
 
@@ -254,15 +254,15 @@ static void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
if (!l_p_j_ref_freq) {
l_p_j_ref = loops_per_jiffy;
l_p_j_ref_freq = ci->old;
-   pr_debug("saving %lu as reference value for loops_per_jiffy; "
-   "freq is %u kHz\n", l_p_j_ref, l_p_j_ref_freq);
+   pr_debug("saving %lu as reference value for loops_per_jiffy; freq is %u kHz\n",
+l_p_j_ref, l_p_j_ref_freq);
}
if ((val == CPUFREQ_POSTCHANGE && ci->old != ci->new) ||
(val == CPUFREQ_RESUMECHANGE || val == CPUFREQ_SUSPENDCHANGE)) {
loops_per_jiffy = cpufreq_scale(l_p_j_ref, l_p_j_ref_freq,
ci->new);
-   pr_debug("scaling loops_per_jiffy to %lu "
-   "for frequency %u kHz\n", loops_per_jiffy, ci->new);
+   pr_debug("scaling loops_per_jiffy to %lu for frequency %u kHz\n",
+loops_per_jiffy, ci->new);
}
 }
 #else
@@ -282,7 +282,7 @@ static void __cpufreq_notify_transition(struct cpufreq_policy *policy,
 
freqs->flags = cpufreq_driver->flags;
pr_debug("notification %u of frequency transition to %u kHz\n",
-   state, freqs->new);
+state, freqs->new);
 
switch (state) {
 
@@ -294,9 +294,8 @@ static void __cpufreq_notify_transition(struct cpufreq_policy *policy,
if (!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
if ((policy) && (policy->cpu == freqs->cpu) &&
(policy->cur) && (policy->cur != freqs->old)) {
-   pr_debug("Warning: CPU frequency is"
-   " %u, cpufreq assumed %u kHz.\n",
-   freqs->old, policy->cur);
+   pr_debug("Warning: CPU frequency is %u, cpufreq assumed %u kHz\n",
+freqs->old, policy->cur);
freqs->old = policy->cur;
}
}
@@ -307,8 +306,8 @@ static void __cpufreq_notify_transition(struct cpufreq_policy *policy,
 
case CPUFREQ_POSTCHANGE:
adjust_jiffies(CPUFREQ_POSTCHANGE, freqs);
-   pr_debug("FREQ: %lu - CPU: %lu", (unsigned long)freqs->new,
-   (unsigned long)freqs->cpu);
+   pr_debug("FREQ: %lu - CPU: %lu\n",
+(unsigned long)freqs->new, (unsigned long)freqs->cpu);
trace_cpu_frequency(freqs->new, freqs->cpu);
srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
CPUFREQ_POSTCHANGE, freqs);
@@ -368,13 +367,13 @@ static ssize_t store_boost(struct kobject *kobj, struct attribute *attr,
return -EINVAL;
 
if (cpufreq_boost_trigger_state(enable)) {
-   pr_err("%s: Cannot %s BOOST!\n", __func__,
-  enable ? "enable" : "disable");
+   pr_err("%s: Cannot %s BOOST!\n",
+  __func__, enable ? "enable" : "disable");
return -EINVAL;
}
 
-   pr_debug("%s: cpufreq BOOST %s\n", __func__,
-enable ? "enabled" : "disabled");
+   pr_debug("%s: cpufreq BOOST %s\n",
+__func__, enable ? "enabled" : "disabled");
 
return count;
 }
@@ -1184,7 +1183,7 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif,
if (gov) {
policy->governor = gov;

Re: [PATCH] cpufreq: Reformat printk() statement

2014-03-09 Thread Viresh Kumar
On 7 March 2014 01:34, Soren Brinkmann  wrote:
> Reformat a printk statement to:
>  - use pr_warn
>  - bring the whole string into a single line in favor of being able to
>grep for the message (ignoring the 80 char limit)
>
> Signed-off-by: Soren Brinkmann 
> ---
>  drivers/cpufreq/cpufreq.c | 7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index cb003a6b72c8..534c2df608ed 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1874,11 +1874,8 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
> if (!gov)
> return -EINVAL;
> else {
> -   printk(KERN_WARNING "%s governor failed, too long"
> -  " transition latency of HW, fallback"
> -  " to %s governor\n",
> -  policy->governor->name,
> -  gov->name);
> +   pr_warn("%s governor failed, too long transition latency of HW, fallback to %s governor\n",
> +   policy->governor->name, gov->name);
> policy->governor = gov;
> }
> }

Acked-by: Viresh Kumar 
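
For readers wondering whether the reformat above changes the message: adjacent string literals are concatenated at compile time, so the split and the single-line form should yield the same string, and only the source becomes greppable. A minimal userspace sketch of that point (the helper names are made up for illustration and are not part of cpufreq):

```c
#include <assert.h>
#include <string.h>

/* Old style: the format string is split across adjacent literals,
 * which the compiler joins into one string. */
static const char *split_format(void)
{
    return "%s governor failed, too long"
           " transition latency of HW, fallback"
           " to %s governor\n";
}

/* New style: one literal, so grepping the source for the full
 * message text finds it. */
static const char *single_line_format(void)
{
    return "%s governor failed, too long transition latency of HW, fallback to %s governor\n";
}

/* Returns 1 when both spellings produce the same format string. */
static int formats_match(void)
{
    return strcmp(split_format(), single_line_format()) == 0;
}
```

Searching the source for "too long transition latency" only matches the second spelling, which is the motivation given in the patch description.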
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [TRIVIAL] ARM: spear: clean up editing mistake

2014-03-09 Thread Viresh Kumar
On 10 March 2014 05:41, Paul Bolle  wrote:
> Clean up an obvious editing mistake introduced by commit 4b6effb6ff38
> ("ARM: spear: merge Kconfig files").
>
> Signed-off-by: Paul Bolle 
> ---
>  arch/arm/mach-spear/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm/mach-spear/Kconfig b/arch/arm/mach-spear/Kconfig
> index 1595776..596e990 100644
> --- a/arch/arm/mach-spear/Kconfig
> +++ b/arch/arm/mach-spear/Kconfig
> @@ -91,7 +91,7 @@ config MACH_SPEAR600
> depends on ARCH_SPEAR6XX
> select USE_OF
> help
> - Supports ST SPEAr600 boards configured via the device-treesource "arch/arm/mach-spear6xx/Kconfig"
> + Supports ST SPEAr600 boards configured via the device-tree
>
>  config ARCH_SPEAR_AUTO
> depends on !ARCH_SPEAR13XX && !ARCH_SPEAR6XX

Acked-by: Viresh Kumar 


RE: [PATCHv3 1/3] ASoC: codec: Simplify ASoC probe code.

2014-03-09 Thread li.xi...@freescale.com
> Subject: Re: [PATCHv3 1/3] ASoC: codec: Simplify ASoC probe code.
> 
> On Mon, Mar 03, 2014 at 07:24:36AM +, li.xi...@freescale.com wrote:
> 
> > > > /* Default to using ALC auto offset calibration mode. */
> > > > snd_soc_update_bits(codec, DA7213_ALC_CTRL1,
> > > > DA7213_ALC_CALIB_MODE_MAN, 0);
> 
> > > This one will fail.
> 
> > Sorry, I'm not very understand why will this fail ? Before the ASoC probe,
> > the ASoC core will set the I/O.
> > :)
> 
> OK, that's now been refactored.

@Mark, @Lars,

Are there any other problems with this patch series? I have tested it on
our Vybrid-TWR board with the SGTL5000 codec and SAI drivers. If not, I can
continue with my second patch series, "Remove set_cache_io entirely from
ASoC probe".

Thanks very much.

--
Best Regards,
Xiubo


RE: [PATCH net-next 0/7] r8152: tx/rx improvement

2014-03-09 Thread hayeswang
 David Miller [mailto:da...@davemloft.net] 
> Sent: Saturday, March 08, 2014 5:28 AM
> To: hayesw...@realtek.com
> Cc: net...@vger.kernel.org; nic_s...@realtek.com; 
> linux-kernel@vger.kernel.org; linux-...@vger.kernel.org
> Subject: Re: [PATCH net-next 0/7] r8152: tx/rx improvement
[...]
> Note that if you ever add ->ndo_poll_controller support to 
> this driver,
> you will have to revert your spin_lock_irq{save,restore}() changes to
> your ->ndo_start_xmit.
> 
> Because the transmit function can indeed be invoked from hard IRQ
> context once you support netpoll.

Thank you for the reminder. I will keep that in mind.



[PATCH 2/2] bridge: multicast: enable snooping on general queries only

2014-03-09 Thread Linus Lüssing
Without this check, someone could easily create a denial of service by
injecting multicast-specific queries. If no real querier issuing periodic
general queries is present on the link, such queries enable the snooping
part of the bridge, and the bridge then wrongly shuts down ports for
multicast traffic because it did not learn about these listeners.

With this patch, the snooping code is enabled only upon receiving valid
general queries.

Signed-off-by: Linus Lüssing 
---
 net/bridge/br_multicast.c |8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index c77f073..5dd4fec 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1127,9 +1127,10 @@ static void br_multicast_query_received(struct net_bridge *br,
struct net_bridge_port *port,
struct bridge_mcast_querier *querier,
int saddr,
+   bool is_general_query,
unsigned long max_delay)
 {
-   if (saddr)
+   if (saddr && is_general_query)
br_multicast_update_querier_timer(br, querier, max_delay);
else if (timer_pending(&querier->timer))
return;
@@ -1190,7 +1191,7 @@ static int br_ip4_multicast_query(struct net_bridge *br,
}
 
br_multicast_query_received(br, port, &br->ip4_querier, !!iph->saddr,
-   max_delay);
+   !group, max_delay);
 
if (!group)
goto out;
@@ -1282,7 +1283,8 @@ static int br_ip6_multicast_query(struct net_bridge *br,
}
 
br_multicast_query_received(br, port, &br->ip6_querier,
-   !ipv6_addr_any(&ip6h->saddr), max_delay);
+   !ipv6_addr_any(&ip6h->saddr),
+   is_general_query, max_delay);
 
if (!group)
goto out;
-- 
1.7.10.4
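
As a reading aid, the branch changed in br_multicast_query_received() can be modelled in plain userspace C. This is only a sketch of the condition visible in the hunk above; the struct and function names are invented for illustration, and whatever the real function does after this branch is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Outcome of the first branch of the query handler (hypothetical type). */
struct query_decision {
    bool update_querier_timer; /* "a querier is present" timer is refreshed */
    bool return_early;         /* handler returns without further work */
};

/* Mirrors the patched condition:
 *   if (saddr && is_general_query)  -> update the querier timer
 *   else if (timer already pending) -> return
 */
static struct query_decision decide(bool has_saddr, bool is_general_query,
                                    bool timer_pending)
{
    struct query_decision d = { false, false };

    if (has_saddr && is_general_query)
        d.update_querier_timer = true;
    else if (timer_pending)
        d.return_early = true;
    return d;
}
```

With this shape, a multicast-specific query (is_general_query false) can no longer refresh the querier timer, which is the behaviour the commit message asks for.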



Re: [PATCHv8 1/4] power_supply: Add inlmt,iterm, min/max temp props

2014-03-09 Thread Jenny Tc
On Fri, Mar 07, 2014 at 09:12:40PM +0100, Pavel Machek wrote:
> On Fri 2014-03-07 10:59:31, Jenny TC wrote:
> > Add new power supply properties for input current, charge termination
> > current, min and max temperature
> > 
> > POWER_SUPPLY_PROP_TEMP_MIN - minimum operatable temperature
> > POWER_SUPPLY_PROP_TEMP_MAX - maximum operatable temperature
> > 
> > POWER_SUPPLY_PROP_INLMT - input current limit programmed by charger. 
> > Indicates
> > the input current for a charging source.
> > 
> > POWER_SUPPLY_PROP_CHARGE_TERM_CUR - Charge termination current used to 
> > detect
> > the end of charge condition
> > 
> > Signed-off-by: Jenny TC 
> 
> The patch no longer matches the description, otherwise it looks good.

Will fix it in the next patch set.


Re: [PATCH] net: phy: Add sysfs attribute to prevent PHY suspend

2014-03-09 Thread David Miller
From: Sebastian Hesselbarth 
Date: Mon, 10 Mar 2014 01:53:33 +0100

> On 03/10/2014 01:41 AM, David Miller wrote:
>> From: Sebastian Hesselbarth 
>> Date: Mon, 10 Mar 2014 01:37:32 +0100
>>
>>> The mechanism is manual, no automatic way to determine it.
>>
>> We recognize BIOS and ACPI bugs and work around them, by looking at
>> version information and whatnot, so you really can't convince me that
>> something similar can't be done here perhaps in the platform code.
> 
> Hmm, if there is a way to determine the version of that particular u-boot
> I'd be happy to exploit that information. But I honestly doubt that.
> Compared to u-boot bootloader and kernel interaction, BIOS and ACPI
> are well-defined protocols.
> 
> I personally, would prefer everybody should update his broken
> bootloaders, but that will just not happen.

What you can do is have a test that _perhaps_ covers a "broader than
reality" list of broken bootloader cases.

Then you have something the bootloader can provide which indicates
that it has been fixed.


[PATCH 1/2] bridge: multicast: add sanity check for general query destination

2014-03-09 Thread Linus Lüssing
General IGMP and MLD queries are supposed to have the multicast
link-local all-nodes address as their destination according to RFC2236
section 9, RFC3376 section 4.1.12/9.1, RFC2710 section 8 and RFC3810
section 5.1.15.

Without this check, such malformed IGMP/MLD queries can result in a
denial of service: most IGMP/MLD listeners ignore these queries and
therefore do not respond with an IGMP/MLD report. However,
without this patch these malformed MLD queries would enable the
snooping part in the bridge code, potentially shutting down the
corresponding ports towards these hosts for multicast traffic, as the
bridge did not learn about these listeners.

Reported-by: Jan Stancek 
Signed-off-by: Linus Lüssing 
---
 net/bridge/br_multicast.c |   19 +++
 1 file changed, 19 insertions(+)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index fb0e36f..c77f073 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1181,6 +1181,14 @@ static int br_ip4_multicast_query(struct net_bridge *br,
IGMPV3_MRC(ih3->code) * (HZ / IGMP_TIMER_SCALE) : 1;
}
 
+   /* RFC2236+RFC3376 (IGMPv2+IGMPv3) require the multicast link layer
+* all-systems destination addresses (224.0.0.1) for general queries
+*/
+   if (!group && iph->daddr != htonl(INADDR_ALLHOSTS_GROUP)) {
+   err = -EINVAL;
+   goto out;
+   }
+
br_multicast_query_received(br, port, &br->ip4_querier, !!iph->saddr,
max_delay);
 
@@ -1228,6 +1236,7 @@ static int br_ip6_multicast_query(struct net_bridge *br,
unsigned long max_delay;
unsigned long now = jiffies;
const struct in6_addr *group = NULL;
+   bool is_general_query;
int err = 0;
 
spin_lock(&br->multicast_lock);
@@ -1262,6 +1271,16 @@ static int br_ip6_multicast_query(struct net_bridge *br,
max_delay = max(msecs_to_jiffies(mldv2_mrc(mld2q)), 1UL);
}
 
+   is_general_query = group && ipv6_addr_any(group) ? true : false;
+
+   /* RFC2710+RFC3810 (MLDv1+MLDv2) require the multicast link layer
+* all-nodes destination address (ff02::1) for general queries
+*/
+   if (is_general_query && !ipv6_addr_is_ll_all_nodes(&ip6h->daddr)) {
+   err = -EINVAL;
+   goto out;
+   }
+
br_multicast_query_received(br, port, &br->ip6_querier,
!ipv6_addr_any(&ip6h->saddr), max_delay);
 
-- 
1.7.10.4
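
The IPv4 half of the sanity check can be tried out in userspace. The following is a rough sketch, assuming addresses are passed in network byte order and that a zero group address marks a general query, as in the hunk above; the function name and the constant are illustrative, not kernel identifiers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <arpa/inet.h>

/* 224.0.0.1 in host byte order (the all-systems group). */
#define SKETCH_ALLHOSTS_GROUP 0xe0000001U

/*
 * group: queried group address, network byte order, 0 for a general query
 * daddr: IP destination address of the query, network byte order
 * Returns 0 if the query looks sane, -EINVAL for a general query that was
 * not sent to 224.0.0.1.
 */
static int check_igmp_query_dest(uint32_t group, uint32_t daddr)
{
    if (group == 0 && daddr != htonl(SKETCH_ALLHOSTS_GROUP))
        return -EINVAL;
    return 0;
}
```

Group-specific queries pass through untouched here; only general queries are tied to the all-systems destination.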



Re: [PATCH] hung_task : check the value of "sysctl_hung_task_timeout_sec"

2014-03-09 Thread Liu hua
on 2014/3/6 23:35, Paul Gortmaker wrote:
> On 14-03-06 02:19 AM, Liu hua wrote:
>> As sysctl_hung_task_timeout_sec is unsigned long, when this value is
>> larger than LONG_MAX, the function schedule_timeout_interruptible in
>> watchdog will return immediately without sleep :
>>
>> for example (in x86_64 platform):
>>
>> linux# echo 0x > /proc/sys/kernel/hung_task_timeout_secs
>>
>> [   66.798350] schedule_timeout: wrong timeout value ff06
>> [   66.800064] schedule_timeout: wrong timeout value ff06
>> [   66.801774] schedule_timeout: wrong timeout value ff06
>> [   66.803488] schedule_timeout: wrong timeout value ff06
>> [   66.805225] schedule_timeout: wrong timeout value ff06
>>
>> The screen was filled with "schedule_timeout: wrong timeout value
>> ff06" and the system stalled.
>>
>> So I add some checks and corrections in timeout_jiffies so that
>> schedule_timeout_interruptible always gets a valid parameter.
>>
>> Signed-off-by: Liu Hua 
>> ---
>>  kernel/hung_task.c | 11 ++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>> index 06bb141..ef96650 100644
>> --- a/kernel/hung_task.c
>> +++ b/kernel/hung_task.c
>> @@ -186,7 +186,16 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>>  static unsigned long timeout_jiffies(unsigned long timeout)
>>  {
>>  /* timeout of 0 will disable the watchdog */
> 
> 
> You are breaking the above functionality/feature by declaring
> zero invalid.
> 
> Paul.
> --
> 
Actually, the patch disables the watchdog if the timeout is illegal (other
than 0) for schedule_timeout_interruptible.
Should I make a new patch that disables the watchdog when the timeout is 0
or above LONG_MAX, without printing errors?

What do you think?

Liu Hua
>> -return timeout ? timeout * HZ : MAX_SCHEDULE_TIMEOUT;
>> +if ((timeout == 0) || (timeout > MAX_SCHEDULE_TIMEOUT)) {
>> +pr_err("%s : wrong timeout value %lx\n",
>> +__func__, timeout);
>> +pr_err("Timeout value is set to MAX_SCHEDULE_TIMEOUT(%lx) now.\n",
>> +MAX_SCHEDULE_TIMEOUT);
>> +return MAX_SCHEDULE_TIMEOUT;
>> +}
>> +
>> +return (timeout * HZ) < MAX_SCHEDULE_TIMEOUT ?
>> +timeout * HZ : MAX_SCHEDULE_TIMEOUT;
>>  }
>>
>>  /*
>>
> 
> .
> 
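
One possible shape for the follow-up, sketched in userspace C: treat 0 as "watchdog disabled" as before, and clamp any value whose product with HZ would not fit, without printing anything. Note that the quoted check `(timeout * HZ) < MAX_SCHEDULE_TIMEOUT` multiplies first, so the product may already have wrapped; dividing the limit instead avoids that. SKETCH_HZ and the function name are assumptions for illustration, and MAX_SCHEDULE_TIMEOUT is taken to be LONG_MAX as in the kernel.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_HZ 250UL /* assumed tick rate, illustration only */
#define SKETCH_MAX_SCHEDULE_TIMEOUT ((unsigned long)LONG_MAX)

/*
 * Convert a timeout in seconds to jiffies.
 * 0 keeps its meaning of "disable the watchdog" (sleep forever).
 * Values that would exceed the maximum after scaling are clamped.
 */
static unsigned long timeout_jiffies_sketch(unsigned long timeout_secs)
{
    if (timeout_secs == 0)
        return SKETCH_MAX_SCHEDULE_TIMEOUT;

    /* Compare before multiplying so the product cannot wrap. */
    if (timeout_secs > SKETCH_MAX_SCHEDULE_TIMEOUT / SKETCH_HZ)
        return SKETCH_MAX_SCHEDULE_TIMEOUT;

    return timeout_secs * SKETCH_HZ;
}
```

Whether clamping silently is acceptable, or whether the sysctl should reject such values on write, is a separate question for the thread.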




Re: [x86, vdso] BUG: unable to handle kernel paging request at d34bd000

2014-03-09 Thread Andy Lutomirski
On Sun, Mar 9, 2014 at 5:16 PM, H. Peter Anvin  wrote:
> On 03/09/2014 12:47 AM, Stefani Seibold wrote:
>>
>> But let me ask an other question: Is the compat mode still needed
>> anymore?
>>
>> Since Lguest, Xen, OLPC and the reservetop kernel parameter will change
>> the __FIXADDR_TOP, there is no fixed place for the VDSO page. Also in the
>> 32 bit emulation layer the address is not fixed.
>>
>> So all applications can fail when they try to directly access the VDSO
>> page with a hard coded address 0xe000.
>>
>> IMHO this is broken. So another solution is to remove the whole VDSO
>> compat code.
>>
>
> Lguest, Xen, OLPC and reservetop are corner cases.  My understanding is
> that at least one widely used distro actually cared about this, and
> Linus especially is adamant that "we don't break userspace."

OK, I did some research.  I think that the commit that fixed the glibc bug was:

commit 49ad572a70b8aeb91e57483a11dd1b77e31c4468
Author: Ulrich Drepper 
Date:   Sat Feb 28 17:56:22 2004 +

Update.

* elf/rtld.c (dl_main): Adjust l->l_ld of the vDSO by l->l_addr.
* sysdeps/generic/dl-sysdep.c (_dl_sysdep_start): Only set
GL(dl_sysinfo) if non-zero.

I don't think that the actual load address of the VDSO matters at all.
 Here's what I think is going on:

When the kernel is built, vdso32-int80.so looks like this (excerpted
from objdump -T):

DYNAMIC SYMBOL TABLE:
0420 gDF .text  0003  LINUX_2.5   __kernel_vsyscall
 gDO *ABS*    LINUX_2.5   LINUX_2.5
0410 gDF .text  0008  LINUX_2.5   __kernel_rt_sigreturn
0400 gDF .text  0009  LINUX_2.5   __kernel_sigreturn

When the kernel is run, the kernel "relocates" the vdso, generating
something more like:

DYNAMIC SYMBOL TABLE:
e420 gDF .text  0014  LINUX_2.5   __kernel_vsyscall
 gDO *ABS*    LINUX_2.5   LINUX_2.5
e410 gDF .text  0008  LINUX_2.5   __kernel_rt_sigreturn
e400 gDF .text  0009  LINUX_2.5   __kernel_sigreturn

That magic 0xe000 offset comes from VDSO_HIGH_BASE - VDSO_PRELINK,
and VDSO_PRELINK seems like an amazingly complicated way to say
"zero".

Before the fix, it looks like glibc couldn't handle a vdso that was
mapped in such a way that its ELF headers didn't match its actual
location.  Now it can.  This is borne out by this message:

commit d4f7a2c18e59e0304a1c733589ce14fc02fec1bd
Author: Jeremy Fitzhardinge 
Date:   Wed May 2 19:27:12 2007 +0200

[PATCH] i386: Relocate VDSO ELF headers to match mapped location with COMPAT

Some versions of libc can't deal with a VDSO which doesn't have its
ELF headers matching its mapped address.  COMPAT_VDSO maps the VDSO at
a specific system-wide fixed address.  Previously this was all done at
build time, on the grounds that the fixed VDSO address is always at
the top of the address space.  However, a hypervisor may reserve some
of that address space, pushing the fixmap address down.

I suspect that it's entirely safe to map the 32-bit vdso wherever the
hell we want, so long as it's relocated to match the actual mapping
address.  In principle it could even live outside the fixmap, as long
as the actual binary that gets run doesn't end up on top of it.

So... I propose that we get rid of all the madness.  Fix the vdso32
setup code to stop being insane.  That means: stop memcpying the vdso
image anywhere and get rid of all references to the magical and wrong
number "3".  Just map it wherever it needs to be mapped and relocate
the damn thing *in place*.  If some RODATA crud gets in the way,
twiddle the protection bits as needed.  That means that all this
"vvars before vdso" nonsense can go away.

(Of course, I haven't the faintest idea what l_addr in glibc means.
If there was a way to arrange for l_addr to be zero, then maybe none
of this would matter.  Hmm, I wonder if just not relocating the vdso
at all would have the desired effect.  Anyone out there understand
glibc?)

--Andy
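
To make the "relocation" step above concrete, here is a toy userspace model: every non-absolute symbol value is shifted by the difference between the mapped address and the link address. The struct, the function, and the delta used in the usage note are all illustrative assumptions, not the kernel's actual relocation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for one dynamic symbol table entry (hypothetical). */
struct vdso_sym {
    const char *name;
    uint32_t value;   /* address as recorded in the image */
    bool is_abs;      /* *ABS* symbols are not tied to the load address */
};

/* Shift every relocatable symbol by delta (arithmetic wraps mod 2^32). */
static void relocate_syms(struct vdso_sym *syms, size_t n, uint32_t delta)
{
    for (size_t i = 0; i < n; i++) {
        if (syms[i].is_abs)
            continue;
        syms[i].value += delta;
    }
}
```

For example, with an assumed delta of 0xffffe000, a symbol linked at 0x420 ends up at 0xffffe420 while an *ABS* entry keeps its value, which has the same shape as the two symbol tables shown above.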


[GIT PULL] namespaces fixes for 3.14-rcX

2014-03-09 Thread Eric W. Biederman

Linus,

Please pull the for-linus branch from the git tree:

   git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git for-linus

   HEAD: d211f177b28ec070c25b3d0b960aa55f352f731f audit: Update kdoc for audit_send_reply and audit_list_rules_send

Starting with 3.14-rc1 the audit code is faulty (think oopses and races)
with respect to how it computes the network namespace of which socket to
reply to, and I happened to notice by chance when reading through the
code.

My efforts to get these fixes noticed by people who care about audit
seem to have landed on deaf ears, so since these are namespace related I
have put them in my tree.

My testing and the automated build bots don't find any problems with
these fixes.

Eric W. Biederman (3):
  audit: Use struct net not pid_t to remember the network namespce to reply in
  audit: Send replies in the proper network namespace.
  audit: Update kdoc for audit_send_reply and audit_list_rules_send

 include/linux/audit.h |3 ++-
 kernel/audit.c|   31 ---
 kernel/audit.h|2 +-
 kernel/auditfilter.c  |   10 +++---
 4 files changed, 26 insertions(+), 20 deletions(-)

Eric


Re: [PATCH net V2] vhost: net: switch to use data copy if pending DMAs exceed the limit

2014-03-09 Thread Qin Chuanyu

On 2014/3/7 13:28, Jason Wang wrote:

We used to stop the handling of tx when the number of pending DMAs
exceeds VHOST_MAX_PEND. This is used to reduce the memory occupation
of both host and guest. But it was too aggressive in some cases, since
any delay or blocking of a single packet may delay or block the guest
transmission. Consider the following setup:

 +-----+        +-----+
 | VM1 |        | VM2 |
 +--+--+        +--+--+
    |              |
 +--+--+        +--+--+
 | tap0|        | tap1|
 +--+--+        +--+--+
    |              |
 pfifo_fast   htb(10Mbit/s)
    |              |
 +--+--------------+---+
 |       bridge        |
 +--+------------------+
    |
 pfifo_fast
    |
 +-----+
 | eth0|(100Mbit/s)
 +-----+

- start two VMs and connect them to a bridge
- add a physical card (100Mbit/s) to that bridge
- set up htb on tap1 and limit its throughput to 10Mbit/s
- run two netperfs at the same time: one from VM1 to VM2, the other
   from VM1 to an external host through eth0.
- the result shows that not only was the VM1 to VM2 traffic throttled, but
   the VM1 to external host traffic through eth0 was also throttled somehow.

This is because the delay added by htb may delay the completion
of DMAs and cause the number of pending DMAs for tap0 to exceed the limit
(VHOST_MAX_PEND). In this case vhost stops handling tx requests until
htb sends some packets. The problem here is that all packet
transmission is blocked, even for packets that do not go to VM2.

We can solve this issue by relaxing it a little bit: switching to
data copy instead of stopping tx when the number of pending DMAs
exceeds half of the vq size. This is safe because:

- The number of pending DMAs is still limited (half of the vq size)
- The out of order completion during mode switch can make sure that
   most of the tx buffers are freed in time in the guest.

So even if about 50% of packets are delayed in the zero-copy case, vhost
can continue to do the transmission through data copy.

Test result:

Before this patch:
VM1 to VM2 throughput is 9.3Mbit/s
VM1 to External throughput is 40Mbit/s
CPU utilization is 7%

After this patch:
VM1 to VM2 throughput is 9.3Mbit/s
VM1 to External throughput is 93Mbit/s
CPU utilization is 16%

A complete performance test on 40gbe shows no obvious changes in either
throughput or CPU utilization with this patch.

The patch only solves this issue with an unlimited sndbuf. We still need a
solution for a limited sndbuf.

Cc: Michael S. Tsirkin 
Cc: Qin Chuanyu 
Signed-off-by: Jason Wang 
---
Changes from V1:
- Remove VHOST_MAX_PEND and switch to use half of the vq size as the limit
- Add cpu utilization in commit log
---
  drivers/vhost/net.c | 19 +++
  1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index a0fa5de..2925e9a 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -38,8 +38,6 @@ MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;"
   * Using this limit prevents one virtqueue from starving others. */
  #define VHOST_NET_WEIGHT 0x8

-/* MAX number of TX used buffers for outstanding zerocopy */
-#define VHOST_MAX_PEND 128
  #define VHOST_GOODCOPY_LEN 256

  /*
@@ -345,7 +343,7 @@ static void handle_tx(struct vhost_net *net)
.msg_flags = MSG_DONTWAIT,
};
size_t len, total_len = 0;
-   int err;
+   int err, num_pends;
size_t hdr_size;
struct socket *sock;
struct vhost_net_ubuf_ref *uninitialized_var(ubufs);
@@ -366,13 +364,6 @@ static void handle_tx(struct vhost_net *net)
if (zcopy)
vhost_zerocopy_signal_used(net, vq);

-   /* If more outstanding DMAs, queue the work.
-* Handle upend_idx wrap around
-*/
-   if (unlikely((nvq->upend_idx + vq->num - VHOST_MAX_PEND)
- % UIO_MAXIOV == nvq->done_idx))
-   break;
-
head = vhost_get_vq_desc(&net->dev, vq, vq->iov,
 ARRAY_SIZE(vq->iov),
 &out, &in,
@@ -405,9 +396,13 @@ static void handle_tx(struct vhost_net *net)
break;
}

+   num_pends = likely(nvq->upend_idx >= nvq->done_idx) ?
+   (nvq->upend_idx - nvq->done_idx) :
+   (nvq->upend_idx + UIO_MAXIOV -
+nvq->done_idx);
+
zcopy_used = zcopy && len >= VHOST_GOODCOPY_LEN
-  && (nvq->upend_idx + 1) % UIO_MAXIOV !=
- nvq->done_idx
+  && num_pends <= vq->num >> 1
   && vhost_net_tx_select_zcopy(net);

/* use msg_control to pass vhost zerocopy ubuf info to skb */


Reviewed-by: Qin 
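
For anyone checking the wrap-around arithmetic in the new num_pends computation, a small userspace sketch follows. It models only the index math and the "at most half of the vq size" test; the other conditions in the patch (zcopy, the length threshold, vhost_net_tx_select_zcopy()) are left out. The ring size constant is assumed to match UIO_MAXIOV, and the function names are made up.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_UIO_MAXIOV 1024 /* ring size; assumed equal to UIO_MAXIOV */

/* Number of zerocopy buffers submitted but not yet completed.
 * upend_idx is the producer index, done_idx the consumer index,
 * both in [0, SKETCH_UIO_MAXIOV). */
static int num_pending(int upend_idx, int done_idx)
{
    return upend_idx >= done_idx
        ? upend_idx - done_idx
        : upend_idx + SKETCH_UIO_MAXIOV - done_idx;
}

/* Zerocopy stays enabled while pending DMAs are at most half the vq size;
 * beyond that the sender would fall back to data copy. */
static bool may_use_zerocopy(int upend_idx, int done_idx, int vq_num)
{
    return num_pending(upend_idx, done_idx) <= (vq_num >> 1);
}
```

With a 256-entry vq, 128 pending buffers still allow zerocopy and 129 do not, so tx keeps going in copy mode instead of stopping.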

Re: [PATCH] net: phy: fix uninitalized WOL parameters in phy_ethtool_get_wol

2014-03-09 Thread Ben Hutchings
On Mon, 2014-03-10 at 02:01 +0100, Sebastian Hesselbarth wrote:
> phy_ethtool_get_wol is a helper to get current WOL settings from
> a phy device. When using this helper on a PHY without .get_wol
> callback, struct ethtool_wolinfo is never set-up correctly and
> may contain misleading information about WOL status.
> 
> To fix this, always zero relevant fields of struct ethtool_wolinfo
> regardless of .get_wol callback availability.

I think it's the caller's responsibility to zero out struct
ethtool_wolinfo.  That is what ethtool_get_wol() does.

Maybe you could split ethtool_get_wol() like we did
ethtool_get_settings(), to support in-kernel invocation of ETHTOOL_GWOL?

Ben.

> Signed-off-by: Sebastian Hesselbarth 
> ---
> Cc: David Miller 
> Cc: Florian Fainelli 
> Cc: net...@vger.kernel.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/net/phy/phy.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 19c9eca0ef26..62a7cd401e1c 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -1092,6 +1092,7 @@ EXPORT_SYMBOL(phy_ethtool_set_wol);
>  
>  void phy_ethtool_get_wol(struct phy_device *phydev, struct ethtool_wolinfo 
> *wol)
>  {
> + wol->supported = wol->wolopts = 0;
>   if (phydev->drv->get_wol)
>   phydev->drv->get_wol(phydev, wol);
>  }

-- 
Ben Hutchings
Computers are not intelligent.  They only think they are.
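
A sketch of the caller-zeroes convention suggested above, in userspace C: the in-kernel query wrapper clears the structure first, so a driver without a callback leaves "nothing supported" rather than stale data. All type and function names here are stand-ins invented for the example, not the ethtool or phylib API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the two fields discussed in the patch (hypothetical). */
struct wolinfo_sketch {
    uint32_t supported;
    uint32_t wolopts;
};

/* Stand-in for a PHY driver with an optional get_wol callback. */
struct phy_drv_sketch {
    void (*get_wol)(struct wolinfo_sketch *wol);
};

/* Helper as it behaves without the patch: does nothing if no callback. */
static void phy_get_wol_sketch(const struct phy_drv_sketch *drv,
                               struct wolinfo_sketch *wol)
{
    if (drv->get_wol)
        drv->get_wol(wol);
}

/* Caller-side wrapper: zero first, then ask the driver. */
static void query_wol_sketch(const struct phy_drv_sketch *drv,
                             struct wolinfo_sketch *wol)
{
    memset(wol, 0, sizeof(*wol));
    phy_get_wol_sketch(drv, wol);
}

/* Example driver callback reporting one made-up capability bit. */
static void fake_get_wol(struct wolinfo_sketch *wol)
{
    wol->supported = 0x20;
    wol->wolopts = 0x20;
}
```

Either place for the zeroing gives the same result for a single caller; the difference is only who owns the contract.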




Re: [RFC PATCH 6/8] ACPI: use platform bus as the default bus for _HID enumeration

2014-03-09 Thread Zhang Rui
On Sun, 2014-03-09 at 19:04 +0100, Rafael J. Wysocki wrote:
> On Sunday, March 09, 2014 11:50:37 PM Zhang Rui wrote:
> > On Wed, 2014-02-26 at 17:11 +0800, Zhang Rui wrote:
> > > Because of the growing demand for enumerating ACPI devices to platform 
> > > bus,
> > > this patch changes the code to enumerate ACPI devices with _HID/_CID to
> > > platform bus by default, unless the device already has a scan handler 
> > > attached.
> > > 
> > > Signed-off-by: Zhang Rui 
> > > ---
> > >  drivers/acpi/acpi_platform.c |   28 
> > >  drivers/acpi/scan.c  |   12 ++--
> > >  2 files changed, 6 insertions(+), 34 deletions(-)
> > > 
> > > diff --git a/drivers/acpi/acpi_platform.c b/drivers/acpi/acpi_platform.c
> > > index dbfe49e..33376a9 100644
> > > --- a/drivers/acpi/acpi_platform.c
> > > +++ b/drivers/acpi/acpi_platform.c
> > > @@ -22,24 +22,6 @@
> > >  
> > >  ACPI_MODULE_NAME("platform");
> > >  
> > > -/*
> > > - * The following ACPI IDs are known to be suitable for representing as
> > > - * platform devices.
> > > - */
> > > -static const struct acpi_device_id acpi_platform_device_ids[] = {
> > > -
> > > - { "PNP0D40" },
> > > - { "ACPI0003" },
> > > - { "VPC2004" },
> > > - { "BCM4752" },
> > > -
> > > - /* Intel Smart Sound Technology */
> > > - { "INT33C8" },
> > > - { "80860F28" },
> > > -
> > > - { }
> > > -};
> > > -
> > >  /**
> > >   * acpi_create_platform_device - Create platform device for ACPI device 
> > > node
> > >   * @adev: ACPI device node to create a platform device for.
> > > @@ -125,13 +107,3 @@ int acpi_create_platform_device(struct acpi_device 
> > > *adev,
> > >   kfree(resources);
> > >   return 1;
> > >  }
> > > -
> > > -static struct acpi_scan_handler platform_handler = {
> > > - .ids = acpi_platform_device_ids,
> > > - .attach = acpi_create_platform_device,
> > > -};
> > > -
> > > -void __init acpi_platform_init(void)
> > > -{
> > > - acpi_scan_add_handler(_handler);
> > > -}
> > > diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> > > index 5967338..61af32e 100644
> > > --- a/drivers/acpi/scan.c
> > > +++ b/drivers/acpi/scan.c
> > > @@ -2022,14 +2022,15 @@ static int acpi_scan_attach_handler(struct 
> > > acpi_device *device)
> > >   handler = acpi_scan_match_handler(hwid->id, );
> > >   if (handler) {
> > >   ret = handler->attach(device, devid);
> > > - if (ret > 0) {
> > > + if (ret > 0)
> > >   device->handler = handler;
> > > - break;
> > > - } else if (ret < 0) {
> > > - break;
> > > - }
> > > + if (ret)
> > > + goto end;
> > >   }
> > >   }
> > > +end:
> > > + if (!list_empty(>pnp.ids) && !device->handler)
> > > + acpi_create_platform_device(device, NULL);
> > 
> > I just found a big problem in this proposal, which affects all the
> > optional scan handlers.
> 
> What do you mean by "optional"?  Such that can be compiled out?
> 
yes.

> > The problem is that, if we disable a scan handler, platform device nodes
> > would be created instead by the code above, because there is no scan
> > handler attached for those ACPI nodes.
> 
> If "we disable a scan handler" means "we compile it out", I'm not sure
> why creating platform devices for the device objects in question will
> be incorrect?
> 
Take the lpss scan handler and the 80860F0A device as an example:
acpi_lpss_create_device() invokes lpss_uart_setup() first and then
registers 80860F0A as a platform device.
If the lpss scan handler is compiled out, we do nothing but register a
platform device directly, so the dw8250_platform_driver driver is still
loaded but probably breaks.

IMO, we should either have a fully functional platform device (if the
scan handler is compiled in) or nothing (if the scan handler is compiled
out).

thanks,
rui
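
The concern above can be sketched as a small userspace model. This is only an illustration, assuming a much simplified flow: the names enumerate(), struct handler and the compiled_in flag are hypothetical and are not kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum outcome { NOTHING, HANDLER_ATTACHED, BARE_PLATFORM_DEVICE };

struct handler {
    const char *id;     /* hardware ID this handler claims */
    bool compiled_in;   /* models a Kconfig option for an optional handler */
};

/* Models the proposed flow: try the handlers, else fall back to a platform device. */
static enum outcome enumerate(const char *hid, const struct handler *h, size_t n)
{
    if (hid == NULL || hid[0] == '\0')
        return NOTHING;                 /* no _HID/_CID: nothing is created */

    for (size_t i = 0; i < n; i++) {
        if (h[i].compiled_in && strcmp(h[i].id, hid) == 0)
            return HANDLER_ATTACHED;    /* handler runs its own setup first */
    }

    /* No handler attached: the RFC creates a platform device anyway. */
    return BARE_PLATFORM_DEVICE;
}
```

With the handler compiled in, "80860F0A" yields HANDLER_ATTACHED. With it compiled out, the same ID yields BARE_PLATFORM_DEVICE, which is the case described above where the platform driver binds without the handler's setup having run.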
> Rafael
> 





[PATCH v2] drm/fb-helper: Do the 'max_conn_count' zero check

2014-03-09 Thread Xiubo Li
We cannot be sure that callers always pass a nonzero 'max_conn_count'.
If max_conn_count is zero, kcalloc() returns ZERO_SIZE_PTR, which is
((void *)16), so a plain NULL check does not catch it.

Fix this by checking 'max_conn_count' for zero at the start of
drm_fb_helper_init().

Signed-off-by: Xiubo Li 
CC: Jani Nikula 
---
 drivers/gpu/drm/drm_fb_helper.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 3d13ca6e2..a0d286c 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -514,6 +514,9 @@ int drm_fb_helper_init(struct drm_device *dev,
struct drm_crtc *crtc;
int i;
 
+   if (!max_conn_count)
+   return -EINVAL;
+
fb_helper->dev = dev;
 
INIT_LIST_HEAD(_helper->kernel_fb_list);
-- 
1.8.4
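
To illustrate why a plain NULL check misses the zero-size case, here is a minimal userspace sketch. The value of ZERO_SIZE_PTR comes from the commit message above. ZERO_OR_NULL_PTR is written as I recall it from the kernel's slab header, using uintptr_t here for portability. model_kcalloc() is a hypothetical stand-in, not the real allocator.

```c
#include <stddef.h>
#include <stdint.h>

/* Modeled on the kernel macros; ZERO_SIZE_PTR is ((void *)16) per the patch text. */
#define ZERO_SIZE_PTR ((void *)16)
#define ZERO_OR_NULL_PTR(x) ((uintptr_t)(x) <= (uintptr_t)ZERO_SIZE_PTR)

/*
 * Hypothetical stand-in for kcalloc(): a zero-byte request returns
 * ZERO_SIZE_PTR, a request that does not fit returns NULL, and anything
 * else returns a shared zeroed static buffer (good enough for a demo).
 */
static void *model_kcalloc(size_t n, size_t size)
{
    static char backing[64];

    if (n == 0 || size == 0)
        return ZERO_SIZE_PTR;
    if (n > sizeof(backing) / size)
        return NULL;
    return backing;
}

/* The check the original code relied on: only NULL counts as failure. */
static int plain_null_check_passes(const void *p)
{
    return p != NULL;
}
```

In this model, model_kcalloc(0, 8) passes the plain NULL check even though the pointer must never be dereferenced, while ZERO_OR_NULL_PTR() flags it. Rejecting a zero count up front, as the v2 patch does, avoids the question entirely.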




Re: [GIT PULL] x86 fixes for v3.14-rc6 - bad patch dropped

2014-03-09 Thread H. Peter Anvin
On 03/09/2014 06:51 PM, Linus Torvalds wrote:
> On Sun, Mar 9, 2014 at 5:15 PM, H. Peter Anvin  wrote:
>>
>> The same collection of fixes except the broken NMI patch dropped.  I
>> will send a fixed version of that plus Suresh' FPU fix in a few days,
>> to get them some testing, plus I will be on a trip (part of why I got
>> unduly rushed this past Friday.  Sorry again for that.)
> 
> I already pulled the broken one, and had actually pushed it out in
> that broken format, which was so upsetting. Normally I tend to delay
> pushing things out until I've done my compile tests, but I had
> stupidly thought I didn't need to that time.
> 

So I see.  Thanks and sorry again.

-hpa




Re: [PATCH] it87_wdt: Work around non-working CIR interrupts

2014-03-09 Thread Guenter Roeck

On 03/06/2014 01:36 AM, Marc van der Wal wrote:

From: Marc van der Wal 

On some hardware platforms, the it87_wdt watchdog resets the machine
despite the watchdog daemon running and writing to /dev/watchdog.

This is due to Consumer IR buffer underrun interrupts being used as
triggers to reset the timer.  On some buggy hardware implementations
such as the iEi AFL-12A-N270 single-board computer, this method does
not work.

However, resetting the timer by writing its original timeout value in
its configuration register over and over again suppresses the unwanted
reboots.

Add a module option (nocir), 0 by default in order not to break existing
setups.  Setting it to 1 enables the workaround.

Fixes bug #42801.
Tested primarily on Linux 3.5.7, applies cleanly on Linux 3.13.5.

Signed-off-by: Marc van der Wal 


Reviewed-by: Guenter Roeck 




Re: [f2fs-dev] [PATCH 1/5] f2fs: update start nid only once each circle

2014-03-09 Thread Jaegeuk Kim
Hi all,

I'll handle them all by myself.
Thank you for the contribution. :)

2014-03-10 (Mon), 09:32 +0800, Gu Zheng:
> On 03/08/2014 07:46 PM, Chao Yu wrote:
> 
> > Hi Gu,
> > 
> >> -Original Message-
> >> From: Gu Zheng [mailto:guz.f...@cn.fujitsu.com]
> >> Sent: Friday, March 07, 2014 6:43 PM
> >> To: Kim
> >> Cc: linux-kernel; f2fs
> >> Subject: [f2fs-dev] [PATCH 1/5] f2fs: update start nid only once each 
> >> circle
> >>
> >>
> >> Signed-off-by: Gu Zheng 
> > 
> > Reviewed-by: Chao Yu 
> > 
> >> ---
> >>  fs/f2fs/node.c |6 +-
> >>  1 files changed, 5 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> >> index 8c14110..9653096 100644
> >> --- a/fs/f2fs/node.c
> >> +++ b/fs/f2fs/node.c
> >> @@ -1875,11 +1875,15 @@ void destroy_node_manager(struct f2fs_sb_info *sbi)
> >>while ((found = __gang_lookup_nat_cache(nm_i,
> >>nid, NATVEC_SIZE, natvec))) {
> >>unsigned idx;
> >> +
> >> +  nid = nat_get_nid(natvec[found - 1]) + 1;
> >> +
> >>for (idx = 0; idx < found; idx++) {
> >>struct nat_entry *e = natvec[idx];
> > 
> > > Could we replace argument 'e' with 'natvec[idx]'? Then we could remove 'e'
> > > and the braces here.
> 
> Agreed. It is neater with this cleanup.
> 
> Regards,
> Gu
> 
> > 
> > Thanks.
> > 
> >> -  nid = nat_get_nid(e) + 1;
> >> +
> >>__del_from_nat_cache(nm_i, e);
> >>}
> >> +
> >>}
> >>f2fs_bug_on(nm_i->nat_cnt);
> >>write_unlock(_i->nat_tree_lock);
> >> --
> >> 1.7.7
> >>
> >>
> > 
> > 
> 
> 

-- 
Jaegeuk Kim
Samsung
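
The idea in the quoted patch, advancing the lookup cursor once per batch instead of once per entry, can be sketched in userspace as follows. This is only an illustration under stated assumptions: gang_lookup() is a hypothetical stand-in for __gang_lookup_nat_cache() over a sorted array, and visit_all() is not f2fs code.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a gang lookup: copies up to max ids that are
 * >= start from a sorted table into out and returns how many were copied.
 */
static size_t gang_lookup(const unsigned *table, size_t n,
                          unsigned start, size_t max, unsigned *out)
{
    size_t found = 0;

    for (size_t i = 0; i < n && found < max; i++)
        if (table[i] >= start)
            out[found++] = table[i];
    return found;
}

/*
 * Visits every id exactly once. The cursor is advanced once per batch,
 * from the last (largest) id returned. Assumes no id equals UINT_MAX,
 * since the "+ 1" would then wrap around to zero.
 */
static size_t visit_all(const unsigned *table, size_t n, size_t batch_size)
{
    unsigned batch[8];
    unsigned nid = 0;
    size_t found, visited = 0;

    if (batch_size == 0 || batch_size > 8)
        batch_size = 8;

    while ((found = gang_lookup(table, n, nid, batch_size, batch))) {
        nid = batch[found - 1] + 1;   /* one cursor update per batch */
        visited += found;             /* per-entry work would go here */
    }
    return visited;
}
```

Because the lookup returns ids in ascending order, the last element of each batch is enough to compute the next starting point, so the per-entry assignment inside the inner loop is redundant.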



[PATCH v3] shdma: add R-Car Audio DMAC peri peri driver

2014-03-09 Thread Kuninori Morimoto
From: Kuninori Morimoto 

Add the Audio DMAC peri peri driver
for the Renesas R-Car Gen2 SoC, using the 'shdma-base'
DMA driver framework.

Signed-off-by: Kuninori Morimoto 
---
v2 -> v3

 - remove the error message printed when devm_kzalloc() fails

 drivers/dma/sh/Kconfig |6 +
 drivers/dma/sh/Makefile|1 +
 drivers/dma/sh/rcar-audmapp.c  |  321 
 include/linux/platform_data/dma-rcar-audmapp.h |   34 +++
 4 files changed, 362 insertions(+)
 create mode 100644 drivers/dma/sh/rcar-audmapp.c
 create mode 100644 include/linux/platform_data/dma-rcar-audmapp.h

diff --git a/drivers/dma/sh/Kconfig b/drivers/dma/sh/Kconfig
index dadd9e01..b4c8138 100644
--- a/drivers/dma/sh/Kconfig
+++ b/drivers/dma/sh/Kconfig
@@ -29,6 +29,12 @@ config RCAR_HPB_DMAE
help
  Enable support for the Renesas R-Car series DMA controllers.
 
+config RCAR_AUDMAC_PP
+   tristate "Renesas R-Car Audio DMAC Peripheral Peripheral support"
+   depends on SH_DMAE_BASE
+   help
+ Enable support for the Renesas R-Car Audio DMAC Peripheral Peripheral controllers.
+
 config SHDMA_R8A73A4
def_bool y
depends on ARCH_R8A73A4 && SH_DMAE != n
diff --git a/drivers/dma/sh/Makefile b/drivers/dma/sh/Makefile
index e856af2..1ce88b2 100644
--- a/drivers/dma/sh/Makefile
+++ b/drivers/dma/sh/Makefile
@@ -7,3 +7,4 @@ endif
 shdma-objs := $(shdma-y)
 obj-$(CONFIG_SUDMAC) += sudmac.o
 obj-$(CONFIG_RCAR_HPB_DMAE) += rcar-hpbdma.o
+obj-$(CONFIG_RCAR_AUDMAC_PP) += rcar-audmapp.o
diff --git a/drivers/dma/sh/rcar-audmapp.c b/drivers/dma/sh/rcar-audmapp.c
new file mode 100644
index 000..884ad3b80
--- /dev/null
+++ b/drivers/dma/sh/rcar-audmapp.c
@@ -0,0 +1,321 @@
+/*
+ * drivers/dma/sh/rcar-audmapp.c
+ *
+ * Copyright (C) 2013 Renesas Electronics Corporation
+ * Copyright (C) 2013 Kuninori Morimoto 
+ *
+ * based on the drivers/dma/sh/shdma.c
+ *
+ * Copyright (C) 2011-2012 Guennadi Liakhovetski 
+ * Copyright (C) 2009 Nobuhiro Iwamatsu 
+ * Copyright (C) 2009 Renesas Solutions, Inc. All rights reserved.
+ * Copyright (C) 2007 Freescale Semiconductor, Inc. All rights reserved.
+ *
+ * This is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * DMA register
+ */
+#define PDMASAR0x00
+#define PDMADAR0x04
+#define PDMACHCR   0x0c
+
+/* PDMACHCR */
+#define PDMACHCR_DE(1 << 0)
+
+#define AUDMAPP_MAX_CHANNELS   29
+
+/* Default MEMCPY transfer size = 2^2 = 4 bytes */
+#define LOG2_DEFAULT_XFER_SIZE 2
+#define AUDMAPP_SLAVE_NUMBER   256
+#define AUDMAPP_LEN_MAX(16 * 1024 * 1024)
+
+struct audmapp_chan {
+   struct shdma_chan shdma_chan;
+   struct audmapp_slave_config *config;
+   void __iomem *base;
+};
+
+struct audmapp_device {
+   struct shdma_dev shdma_dev;
+   struct audmapp_pdata *pdata;
+   struct device *dev;
+   void __iomem *chan_reg;
+};
+
+#define to_chan(chan) container_of(chan, struct audmapp_chan, shdma_chan)
+#define to_dev(chan) container_of(chan->shdma_chan.dma_chan.device,\
+ struct audmapp_device, shdma_dev.dma_dev)
+
+static void audmapp_write(struct audmapp_chan *auchan, u32 data, u32 reg)
+{
+   struct audmapp_device *audev = to_dev(auchan);
+   struct device *dev = audev->dev;
+
+   dev_dbg(dev, "w %p : %08x\n", auchan->base + reg, data);
+
+   iowrite32(data, auchan->base + reg);
+}
+
+static u32 audmapp_read(struct audmapp_chan *auchan, u32 reg)
+{
+   return ioread32(auchan->base + reg);
+}
+
+static void audmapp_halt(struct shdma_chan *schan)
+{
+   struct audmapp_chan *auchan = to_chan(schan);
+   int i;
+
+   audmapp_write(auchan, 0, PDMACHCR);
+
+   for(i = 0; i < 1024; i++) {
+   if (0 == audmapp_read(auchan, PDMACHCR))
+   return;
+   udelay(1);
+   }
+}
+
+static void audmapp_start_xfer(struct shdma_chan *schan,
+  struct shdma_desc *sdecs)
+{
+   struct audmapp_chan *auchan = to_chan(schan);
+   struct audmapp_device *audev = to_dev(auchan);
+   struct audmapp_slave_config *cfg = auchan->config;
+   struct device *dev = audev->dev;
+   u32 chcr = cfg->chcr | PDMACHCR_DE;
+
+   dev_dbg(dev, "src/dst/chcr = %x/%x/%x\n",
+   cfg->src, cfg->dst, cfg->chcr);
+
+   audmapp_write(auchan, cfg->src, PDMASAR);
+   audmapp_write(auchan, cfg->dst, PDMADAR);
+   audmapp_write(auchan, chcr, PDMACHCR);
+}
+
+static struct audmapp_slave_config *
+audmapp_find_slave(struct audmapp_chan *auchan, int slave_id)
+{
+   struct audmapp_device 

Re: [RFC PATCH 2/8] PNPACPI: use white list for pnpacpi device enumeration

2014-03-09 Thread Zhang Rui
On Sun, 2014-03-09 at 18:49 +0100, Rafael J. Wysocki wrote:
> On Sunday, March 09, 2014 01:29:30 PM Zhang Rui wrote:
> > On Fri, 2014-03-07 at 02:44 +0100, Rafael J. Wysocki wrote:
> > > On Wednesday, February 26, 2014 05:11:08 PM Zhang Rui wrote:
> > > > +
> > > > +static int __init acpi_pnp_scan_handler_attach(struct acpi_device 
> > > > *adev,
> > > 
> > > This can't be __init.
> > > 
> > > > +   const struct acpi_device_id *id)
> > > > +{
> > > > +   return 1;
> > > > +}
> > > > +
> > > > +static int __init acpi_pnp_scan_handler_match(char *devid, char 
> > > > *handlerid)
> > > 
> > > And this too.
> > > 
> > > > +{
> > > > +   int i;
> > > > +
> > > > +   if (!ispnpidacpi(devid))
> > > > +   return 0;
> > > > +
> > > > +   if (memcmp(devid, handlerid, 3))
> > > > +   return 0;
> > > > +
> > > > +for (i = 3; i < 7; i++) {
> > > > +if (handlerid[i] != 'X' &&
> > > > +   toupper(devid[i]) != toupper(handlerid[i]))
> > > > +return 0;
> > > > +}
> > > > +   return 1;
> > > > +}
> > > > +
> > > > +static struct acpi_scan_handler pnpacpi_handler __initdata = {
> > > 
> > > And this cannot be __initdata, because the list of ACPI scan handlers is
> > > walked during hotplug.
> > > 
> > right. will update it in next version.
> > 
> > > > +   .ids = acpi_pnp_device_ids,
> > > > +   .match = acpi_pnp_scan_handler_match,
> > > > +   .attach = acpi_pnp_scan_handler_attach,
> > > > +};
> > > > +
> > > > +void __init acpi_pnp_init(void)
> > > > +{
> > > > +   acpi_scan_add_handler(_handler);
> > > > +}
> > > > +
> > > > +static acpi_status __init pnpacpi_add_device_handler(acpi_handle 
> > > > handle,
> > > > +u32 lvl, void 
> > > > *context,
> > > > +void **rv)
> > > > +{
> > > > +   struct acpi_device *device;
> > > > +
> > > > +   if (acpi_bus_get_device(handle, ))
> > > > +   return AE_CTRL_DEPTH;
> > > > +   if (device->handler == _handler)
> > > > +   pnpacpi_add_device(device);
> > > 
> > > Why don't you do that in acpi_pnp_scan_handler_attach() ?
> > > 
> > because the PNP bus is not ready at that time.
> 
> Do you mean it hasn't been initialized yet?
> 
right.
both acpi_init() and pnp_init() are subsys_initcall, but pnp_init() is
invoked later because of the link order.

> But we can initialize it before the scan handler initialization, can't we?
> 
Well, I'm not sure. Here is what I found in drivers/Makefile:
obj-$(CONFIG_ACPI)  += acpi/
obj-$(CONFIG_SFI)   += sfi/
# PnP must come after ACPI since it will eventually need to check if acpi
# was used and do nothing if so
obj-$(CONFIG_PNP)   += pnp/

This comment was added a long time ago and I do not know the background
behind it. Maybe Len knows the reason for it?

thanks,
rui
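
The matching rule in the quoted acpi_pnp_scan_handler_match() can be tried out in userspace with the sketch below. It is a minimal illustration, assuming a simple length test in place of the ispnpidacpi() validity check, which is not shown in this thread. The name pnp_id_match() is hypothetical.

```c
#include <ctype.h>
#include <string.h>

/*
 * Userspace sketch of the quoted matching rule: the three-letter vendor
 * prefix must match exactly, and each of the four following characters
 * must match case-insensitively unless the handler id has 'X' there.
 */
static int pnp_id_match(const char *devid, const char *handlerid)
{
    /* Simplification: a length test stands in for ispnpidacpi(). */
    if (strlen(devid) != 7 || strlen(handlerid) != 7)
        return 0;

    if (memcmp(devid, handlerid, 3))
        return 0;

    for (int i = 3; i < 7; i++) {
        if (handlerid[i] != 'X' &&
            toupper((unsigned char)devid[i]) !=
            toupper((unsigned char)handlerid[i]))
            return 0;
    }
    return 1;
}
```

For example, a handler id of "PNP05XX" would match "PNP0501" in this model but not "PNP0601", and hex digits compare equal regardless of case.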



Re: [PATCH v2][RESENT] dma: add R-Car Audio DMAC peri peri driver

2014-03-09 Thread Kuninori Morimoto

Hi Joe

> > diff --git a/drivers/dma/sh/rcar-audmapp.c b/drivers/dma/sh/rcar-audmapp.c
> []
> > +static int audmapp_chan_probe(struct platform_device *pdev,
> > + struct audmapp_device *audev, int id)
> > +{
> []
> > +   auchan = devm_kzalloc(dev, sizeof(struct audmapp_chan), GFP_KERNEL);
> > +   if (!auchan) {
> > +   dev_err(dev, "No free memory for allocating dma channels!\n");
> 
> Unnecessary OOM as the alloc has a generic OOM
> and a dump_stack()
> 
> []
> > +static int audmapp_probe(struct platform_device *pdev)
> > +{
> []
> > +   audev = devm_kzalloc(>dev, sizeof(struct audmapp_device),
> > +GFP_KERNEL);
> > +   if (!audev) {
> > +   dev_err(>dev, "Not enough memory\n");
> 
> here too

Thank you.
will fix in v3


Re: [PATCH] watchdog: bcm281xx: Fix Kconfig dependency

2014-03-09 Thread Guenter Roeck

On 03/05/2014 02:57 AM, Markus Mayer wrote:

Use the more applicable ARCH_BCM_MOBILE option instead of ARCH_BCM as
the dependency for bcm_kona_wdt.c.

Signed-off-by: Markus Mayer 


Acked-by: Guenter Roeck 




RE: [PATCH] drm/fb-helper: Do the mode_set.connectors ZERO_SIZE_PTR check

2014-03-09 Thread li.xi...@freescale.com

> Subject: Re: [PATCH] drm/fb-helper: Do the mode_set.connectors ZERO_SIZE_PTR
> check
> 
> On Thu, 06 Mar 2014, Xiubo Li  wrote:
> > We cannot be sure that callers always pass a nonzero 'max_conn_count'.
> > If max_conn_count is zero, kcalloc() returns ZERO_SIZE_PTR, which is
> > ((void *)16).
> >
> > So this patch fixes this by also checking for the zero-size pointer.
> 
> Please just add a check for max_conn_count == 0 up front and handle it.
> 
> BR,
> Jani.
> 

Yes, that's a better choice.
Please see the next version.

Thanks very much!

--
Best Regards,
Xiubo 

> 
> >
> > Signed-off-by: Xiubo Li 
> > ---
> >  drivers/gpu/drm/drm_fb_helper.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> > index 3d13ca6e2..c6680ef 100644
> > --- a/drivers/gpu/drm/drm_fb_helper.c
> > +++ b/drivers/gpu/drm/drm_fb_helper.c
> > @@ -536,7 +536,7 @@ int drm_fb_helper_init(struct drm_device *dev,
> > sizeof(struct drm_connector *),
> > GFP_KERNEL);
> >
> > -   if (!fb_helper->crtc_info[i].mode_set.connectors)
> > +   if (ZERO_OR_NULL_PTR(fb_helper->crtc_info[i].mode_set.connectors))
> > goto out_free;
> > fb_helper->crtc_info[i].mode_set.num_connectors = 0;
> > }
> > --
> > 1.8.4
> >
> >
> 
> --
> Jani Nikula, Intel Open Source Technology Center
> 



Re: [PATCH] zram: use scnprintf() in attrs show() methods

2014-03-09 Thread Minchan Kim
On Sun, Mar 09, 2014 at 10:27:08PM +0300, Sergey Senozhatsky wrote:
> sysfs.txt documentation lists the following requirements:
> 
>  - The buffer will always be PAGE_SIZE bytes in length. On i386, this
>is 4096.
> 
>  - show() methods should return the number of bytes printed into the
>buffer. This is the return value of scnprintf().
> 
>  - show() should always use scnprintf().
> 
> Use scnprintf() in show() functions.
> 
> Signed-off-by: Sergey Senozhatsky 

Acked-by: Minchan Kim 

> ---
>  drivers/block/zram/zcomp.c|  8 +---
>  drivers/block/zram/zram_drv.c | 12 ++--
>  2 files changed, 11 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
> index 92a83df..ffcaff7 100644
> --- a/drivers/block/zram/zcomp.c
> +++ b/drivers/block/zram/zcomp.c
> @@ -274,12 +274,14 @@ ssize_t zcomp_available_show(const char *comp, char 
> *buf)
>  
>   while (backends[i]) {
>   if (sysfs_streq(comp, backends[i]->name))
> - sz += sprintf(buf + sz, "[%s] ", backends[i]->name);
> + sz += scnprintf(buf + sz, PAGE_SIZE - sz - 2,
> + "[%s] ", backends[i]->name);
>   else
> - sz += sprintf(buf + sz, "%s ", backends[i]->name);
> + sz += scnprintf(buf + sz, PAGE_SIZE - sz - 2,
> + "%s ", backends[i]->name);
>   i++;
>   }
> - sz += sprintf(buf + sz, "\n");
> + sz += scnprintf(buf + sz, PAGE_SIZE - sz, "\n");
>   return sz;
>  }
>  
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 7631ef0..e6ff420 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -47,7 +47,7 @@ static ssize_t zram_attr_##name##_show(struct device *d,
> \
>   struct device_attribute *attr, char *b) \
>  {\
>   struct zram *zram = dev_to_zram(d); \
> - return sprintf(b, "%llu\n", \
> + return scnprintf(b, PAGE_SIZE, "%llu\n",\
>   (u64)atomic64_read(>stats.name)); \
>  }\
>  static struct device_attribute dev_attr_##name = \
> @@ -68,7 +68,7 @@ static ssize_t disksize_show(struct device *dev,
>  {
>   struct zram *zram = dev_to_zram(dev);
>  
> - return sprintf(buf, "%llu\n", zram->disksize);
> + return scnprintf(buf, PAGE_SIZE, "%llu\n", zram->disksize);
>  }
>  
>  static ssize_t initstate_show(struct device *dev,
> @@ -81,7 +81,7 @@ static ssize_t initstate_show(struct device *dev,
>   val = init_done(zram);
>   up_read(>init_lock);
>  
> - return sprintf(buf, "%u\n", val);
> + return scnprintf(buf, PAGE_SIZE, "%u\n", val);
>  }
>  
>  static ssize_t orig_data_size_show(struct device *dev,
> @@ -89,7 +89,7 @@ static ssize_t orig_data_size_show(struct device *dev,
>  {
>   struct zram *zram = dev_to_zram(dev);
>  
> - return sprintf(buf, "%llu\n",
> + return scnprintf(buf, PAGE_SIZE, "%llu\n",
>   (u64)(atomic64_read(>stats.pages_stored)) << PAGE_SHIFT);
>  }
>  
> @@ -105,7 +105,7 @@ static ssize_t mem_used_total_show(struct device *dev,
>   val = zs_get_total_size_bytes(meta->mem_pool);
>   up_read(>init_lock);
>  
> - return sprintf(buf, "%llu\n", val);
> + return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
>  }
>  
>  static ssize_t max_comp_streams_show(struct device *dev,
> @@ -118,7 +118,7 @@ static ssize_t max_comp_streams_show(struct device *dev,
>   val = zram->max_comp_streams;
>   up_read(>init_lock);
>  
> - return sprintf(buf, "%d\n", val);
> + return scnprintf(buf, PAGE_SIZE, "%d\n", val);
>  }
>  
>  static ssize_t max_comp_streams_store(struct device *dev,
> -- 
> 1.9.0.477.g83111a0
> 

-- 
Kind regards,
Minchan Kim
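
For readers unfamiliar with the difference between the two return conventions, here is a userspace sketch of scnprintf() semantics built on vsnprintf(). It is an approximation for illustration, not the kernel implementation, and model_scnprintf() is a hypothetical name.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace model of scnprintf(): returns the number of characters
 * actually stored in buf (excluding the trailing NUL), never the length
 * the output would have had with unlimited space.
 */
static int model_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int i;

    if (size == 0)
        return 0;

    va_start(args, fmt);
    i = vsnprintf(buf, size, fmt, args);
    va_end(args);

    if (i < 0)
        return 0;               /* treat an encoding error as nothing written */
    if ((size_t)i < size)
        return i;               /* everything fit */
    return (int)(size - 1);     /* truncated: report what was stored */
}
```

This matters for the accumulation pattern in zcomp_available_show(): adding snprintf()-style return values to sz could push the offset past the end of the buffer after a truncation, whereas the scnprintf()-style value never exceeds the space that was available.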


[PATCH] net: add a pre-check of net_ns in sk_change_net()

2014-03-09 Thread Gu Zheng
We do not need to switch the net_ns if the target net_ns is the same
as the current one, so add a pre-check of net_ns to avoid
this, as David suggested.

Signed-off-by: Gu Zheng 
---
 include/net/sock.h |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 5c3f7c3..9678569 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2252,8 +2252,12 @@ void sock_net_set(struct sock *sk, struct net *net)
  */
 static inline void sk_change_net(struct sock *sk, struct net *net)
 {
-   put_net(sock_net(sk));
-   sock_net_set(sk, hold_net(net));
+   struct net *current_net = sock_net(sk);
+
+   if (!net_eq(current_net, net)) {
+   put_net(current_net);
+   sock_net_set(sk, hold_net(net));
+   }
 }
 
 static inline struct sock *skb_steal_sock(struct sk_buff *skb)
-- 
1.7.7
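
The effect of the pre-check can be modeled in userspace with plain integer reference counts. This is a sketch under stated assumptions: struct ns_model, struct sock_model and change_ns() are hypothetical and only mimic the shape of sk_change_net().

```c
/* Hypothetical refcounted namespace and socket, for illustration only. */
struct ns_model { int refs; };
struct sock_model { struct ns_model *ns; };

/*
 * Switches sk to target, moving one reference from the old namespace to
 * the new one. Returns 1 if a switch happened and 0 if it was skipped
 * because sk already belongs to target.
 */
static int change_ns(struct sock_model *sk, struct ns_model *target)
{
    struct ns_model *current_ns = sk->ns;

    if (current_ns == target)
        return 0;           /* pre-check: nothing to put or hold */

    current_ns->refs--;     /* models put_net() on the old namespace */
    target->refs++;         /* models hold_net() on the new namespace */
    sk->ns = target;
    return 1;
}
```

Without the pre-check, switching to the same namespace would drop and then retake the same reference, which is wasted work at best.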




Re: [PATCH] zram: propagate error to user

2014-03-09 Thread Minchan Kim
Hello Sergey,

On Sun, Mar 09, 2014 at 07:58:51PM +0300, Sergey Senozhatsky wrote:
> Hello Minchan,
> 
> On (03/07/14 18:51), Minchan Kim wrote:
> > Hello Sergey!
> > 
> > On Fri, Mar 07, 2014 at 12:20:45PM +0300, Sergey Senozhatsky wrote:
> > > On (03/07/14 10:56), Minchan Kim wrote:
> > > > > When zcomp is initialized as single stream, max_comp_streams cannot be
> > > > > changed without a zram reset. However, the current interface does not
> > > > > report any error to the user and even updates the max_comp_streams value
> > > > > without any effect, which is very confusing.
> > > > > 
> > > > > This patch prevents changing max_comp_streams when zcomp was
> > > > > initialized as single zcomp and returns an error to the user (e.g., echo).
> > > > 
> > > > Signed-off-by: Minchan Kim 
> > > > ---
> > > >  Documentation/blockdev/zram.txt |  9 +
> > > >  drivers/block/zram/zcomp.c  | 10 +-
> > > >  drivers/block/zram/zcomp.h  |  4 ++--
> > > >  drivers/block/zram/zram_drv.c   | 15 +++
> > > >  4 files changed, 23 insertions(+), 15 deletions(-)
> > > > 
> > > > diff --git a/Documentation/blockdev/zram.txt 
> > > > b/Documentation/blockdev/zram.txt
> > > > index 2604ffed51db..0595c3f56ccf 100644
> > > > --- a/Documentation/blockdev/zram.txt
> > > > +++ b/Documentation/blockdev/zram.txt
> > > > @@ -37,10 +37,11 @@ Note:
> > > >  In order to enable compression backend's multi stream support 
> > > > max_comp_streams
> > > >  must be initially set to desired concurrency level before ZRAM device
> > > >  initialisation. Once the device initialised as a single stream 
> > > > compression
> > > > -backend (max_comp_streams equals to 0) changing the value of 
> > > > max_comp_streams
> > > > -will not take any effect, because single stream compression backend 
> > > > implemented
> > > > -as a special case and does not support dynamic max_comp_streams. Only 
> > > > multi
> > > > -stream backend supports dynamic max_comp_streams adjustment.
> > > > +backend (max_comp_streams equals to 1), you will see error if you try 
> > > > to change
> > > > +the value of max_comp_streams because single stream compression backend
> > > > +implemented as a special case by lock overhead issue and does not 
> > > > support
> > > > +dynamic max_comp_streams. Only multi stream backend supports dynamic
> > > > +max_comp_streams adjustment.
> > > >  
> > > >  3) Select compression algorithm
> > > > Using comp_algorithm device attribute one can see available and
> > > > diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
> > > > index 92a83df40a27..15fe6a27781b 100644
> > > > --- a/drivers/block/zram/zcomp.c
> > > > +++ b/drivers/block/zram/zcomp.c
> > > > @@ -152,7 +152,7 @@ static void zcomp_strm_multi_release(struct zcomp 
> > > > *comp, struct zcomp_strm *zstr
> > > >  }
> > > >  
> > > >  /* change max_strm limit */
> > > > -static int zcomp_strm_multi_set_max_streams(struct zcomp *comp, int 
> > > > num_strm)
> > > > +static bool zcomp_strm_multi_set_max_streams(struct zcomp *comp, int 
> > > > num_strm)
> > > >  {
> > > > struct zcomp_strm_multi *zs = comp->stream;
> > > > struct zcomp_strm *zstrm;
> > > > @@ -171,7 +171,7 @@ static int zcomp_strm_multi_set_max_streams(struct 
> > > > zcomp *comp, int num_strm)
> > > > zs->avail_strm--;
> > > > }
> > > > spin_unlock(>strm_lock);
> > > > -   return 0;
> > > > +   return true;
> > > >  }
> > > >  
> > > >  static void zcomp_strm_multi_destroy(struct zcomp *comp)
> > > > @@ -231,10 +231,10 @@ static void zcomp_strm_single_release(struct 
> > > > zcomp *comp,
> > > > mutex_unlock(>strm_lock);
> > > >  }
> > > >  
> > > > -static int zcomp_strm_single_set_max_streams(struct zcomp *comp, int 
> > > > num_strm)
> > > > +static bool zcomp_strm_single_set_max_streams(struct zcomp *comp, int 
> > > > num_strm)
> > > >  {
> > > > /* zcomp_strm_single support only max_comp_streams == 1 */
> > > > -   return -ENOTSUPP;
> > > > +   return 0;
> > > >  }
> > > 
> > > IMHO, -ENOTSUPP for unsupported operation fits better than `false'.
> > > yes, currently there are only two possible returns:
> > >   0 -- success
> > >   -ENOTSUPP - not supported operation
> > > 
> > > though, we can extend functions later and return additional codes, other
> > > than `false' and `true'.
> > 
> > > What we expose to the user isn't true or false but EINVAL.
> > 
> 
> sure. I mean we can return actual zcomp_set_max_streams() error (if any)
> back to user from max_comp_streams_store():
> 
>   [..]
>   ret = zcomp_set_max_streams(...);
>   [..]
>   return ret;
> 
> > > 
> > > > For example, -E2BIG if the user requested an extremely large number of
> > > > streams, like 5000 streams.
> > 
> > I'm not sure it's right example for E2BIG.
> > When I read the comment, it says "argument list too long".
> > Anyway, when I tried ENOTSUPP, echo doesn't show meaningful error to user
> > and I dont' know it's casual err 

Re: [PATCH v2 1/2] i2c: add DMA support for freescale i2c driver

2014-03-09 Thread Shawn Guo
On Thu, Mar 06, 2014 at 12:57:42PM +0100, Marek Vasut wrote:
> On Thursday, March 06, 2014 at 06:02:03 AM, Yao Yuan wrote:
> > On Thu, March 06, 2014 at 12:44:14 PM, Marek Vasut wrote:
> > > On Thursday, March 06, 2014 at 05:36:14 AM, Yao Yuan wrote:
> > > > On Thu, March 06, 2014 at 11:23:50 AM, Marek Vasut wrote:
> > > > > On Wednesday, March 05, 2014 at 07:52:31 AM, Yuan Yao wrote:
> > > > > > > Add DMA support for i2c. This function depends on the DMA driver.
> > > > > > > You can turn it on by writing both the dmas and dma-name properties
> > > > > > > in the dts node.
> > > > > > 
> > > > > > Signed-off-by: Yuan Yao 
> > > > > > ---
> > > > > 
> > > > > [...]
> > > > > 
> > > > > > @@ -601,6 +826,7 @@ static int i2c_imx_probe(struct
> > > > > > platform_device
> > > > > 
> > > > > *pdev)
> > > > > 
> > > > > > void __iomem *base;
> > > > > > int irq, ret;
> > > > > > u32 bitrate;
> > > > > > 
> > > > > > +   u32 phy_addr;
> > > > > > 
> > > > > > dev_dbg(>dev, "<%s>\n", __func__);
> > > > > > 
> > > > > > @@ -611,6 +837,7 @@ static int i2c_imx_probe(struct
> > > > > > platform_device
> > > > > 
> > > > > *pdev)
> > > > > 
> > > > > > }
> > > > > > 
> > > > > > res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > > > > > 
> > > > > > +   phy_addr = res->start;
> > > > > 
> > > > > Uh ... Shawn, I really think I am lost here. Don't you need to map
> > > > > this memory before you can use it for DMA ? The DMA mapping function
> > > > > should give you the physical address and is the right way to go
> > > > > about this instead of pulling the address from here, no ?
> > > > > 
> > > > > I might be wrong here, I am rather uncertain, so please help me out.
> > > > > Thanks!
> > > > 
> > > > > Hi Marek, thanks for your suggestion.
> > > > > As you can see in include/linux/ioport.h, resource->start
> > > > > describes the entity on the CPU bus as a starting physical
> > > > > address, so I think it can be used for DMA directly.
> > > 
> > > > This doesn't feel right for some reason. If this is a register area, you
> > > > should ioremap() it. If it's a memory area you do DMA to/from, you need to
> > > > make sure you correctly flush/invalidate caches and properly handle the
> > > > effects the write buffer might have. But I have a feeling you actually do
> > > > DMA to/from register space here ?
> > 
> > Yes, It's a register area. But I don't know why I should ioremap() it? It's
> > a bus address and DMA can use it directly. Is there some problem for my
> > understanding ?
> 
> I am not too sure here, thus I am poking someone who can clearly answer this.

There is already a devm_ioremap_resource() call in the existing code for
CPU to access registers in virtual address.  And my understanding on
Yuan's patch is that he needs the physical address of I2C DATA register
for DMA from/to the controller.

Shawn
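
The distinction above (a virtual mapping for CPU register access, a physical address for the DMA engine) can be sketched in userspace C. This is only an illustration under stated assumptions: `struct resource_sketch`, `I2C_DATA_REG_OFFSET` and `i2c_data_reg_dma_addr()` are hypothetical names, and the offset value is assumed rather than taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct resource. */
struct resource_sketch {
	uint32_t start;	/* first physical (bus) address of the register window */
	uint32_t end;	/* last physical address of the window */
};

/* Assumed offset of the I2C data register inside the window. */
#define I2C_DATA_REG_OFFSET 0x04u

/*
 * Address a DMA engine would be given: physical base plus register
 * offset. The CPU-side virtual pointer from ioremap plays no part here.
 */
static uint32_t i2c_data_reg_dma_addr(const struct resource_sketch *res)
{
	return res->start + I2C_DATA_REG_OFFSET;
}
```

The CPU would keep using the pointer returned by devm_ioremap_resource() for its own register reads and writes; only the value computed here would be passed to the DMA controller.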

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT PULL] x86 fixes for v3.14-rc6 - bad patch dropped

2014-03-09 Thread Linus Torvalds
On Sun, Mar 9, 2014 at 5:15 PM, H. Peter Anvin  wrote:
>
> The same collection of fixes except the broken NMI patch dropped.  I
> will send a fixed version of that plus Suresh' FPU fix in a few days,
> to get them some testing, plus I will be on a trip (part of why I got
> unduly rushed this past Friday.  Sorry again for that.)

I already pulled the broken one, and had actually pushed it out in
that broken format, which was so upsetting. Normally I tend to delay
pushing things out until I've done my compile tests, but I had
stupidly thought I didn't need to that time.

Linus


Re: 3.13.?: Strange / dangerous fan policy...

2014-03-09 Thread Manuel Krause
cooling_devices and 
thermal_zones separately for each bad/good kernel. For Emailing 
purposes only. You can merge them into a spreadsheet for your 
evaluation on your own. I've left out reporting some subdirs and 
subdir's values that _really_ didn't seem to need attention.


Also, I've collected the #sensors output for each readout, 
having reproduced nearly the same workload, represented by the 
"Fan speed" (thermal_zone4==FDTZ).


And I've done my very best to not produce typos or c errors.


 3.13.5 -- 20140309 -- 20:52 -- bad
====================================
dir              /type      /cur_state  /max_state
cooling_device0  Processor  0           10
cooling_device1  Processor  0           10
cooling_device2  Fan        0            1
cooling_device3  Fan        1            1
cooling_device4  Fan        0            1
cooling_device5  Fan        0            1
cooling_device6  Fan        0            1
cooling_device7  LCD        0           24

 3.12.13 -- 20140310 -- 00:26 -- good
======================================
dir              /type      /cur_state  /max_state
cooling_device0  Processor  0           10
cooling_device1  Processor  0           10
cooling_device2  Fan        0            1
cooling_device3  Fan        1            1
cooling_device4  Fan        1            1
cooling_device5  Fan        1            1
cooling_device6  Fan        1            1
cooling_device7  LCD        0           24


 3.13.5 -- 20140309 -- 20:52 -- bad
=
dir  |-
  /passive /temp  |- /cdev?_  /trip_   /trip_
  trip_point_   point_
  point?_temp   ?_type
thermal_zone0  068000   ?=0n.a.   256000   critical
thermal_zone1   n.a.7 |-
?=0   6   11   critical
?=1   5   107000   passive
?=2   49   active
?=3   375000   active
?=4   255000   active
?=5   145000   active
?=6   13   active
thermal_zone2   n.a.54000 |-
?=0   1   105000   critical
?=1   195000   passive
thermal_zone3   n.a.25800 |-
?=0   1   11   critical
?=1   16   passive
thermal_zone4  058000   ?=0n.a.   11   critical


 3.12.13 -- 20140310 -- 00:26 -- good
==
dir  |-
  /passive /temp  |- /cdev?_  /trip_   /trip_
  trip_point_   point_
  point?_temp   ?_type
thermal_zone0  05   ?=0n.a.   256000   critical
thermal_zone1   n.a.7 |-
?=0   1   11   critical
?=1   1   107000   passive
?=2   29   active
?=3   367000   active
?=4   455000   active
?=5   545000   active
?=6   63   active
thermal_zone2   n.a.53000 |-
?=0   1   105000   critical
?=1   195000   passive
thermal_zone3   n.a.25600 |-
?=0   1   11   critical
?=1   16   passive
thermal_zone4  058000   ?=0n.a.   11   critical

---
Legend here:
   /type  is always  acpitz
   /mode enabled
   /policy   step_wise

  - from kernel ACPI initialisation: thermal_zone0==DTSZ,
 thermal_zone1==CPUZ, thermal_zone2==SKNZ,
 thermal_zone3==BATZ, thermal_zone4==FDTZ
  - n.a. means  file or value is not available
___
Legend in general:
 /power/control  is always  auto
 /power/runtime_status  unsupported
 /uevent''==empty



 3.13.5 -- 20140309 -- 20:52 -- bad
=
# sensors
acpitz-virtual-0
Adapter: Virtual device
temp1:+68.0°C  (crit = +256.0°C)
temp2:+70.0°C  (crit = +110.0°C)
temp3:+54.0°C  (crit = +105.0°C)
temp4:+25.8°C  (crit = +110.0°C)
temp5:+58.0°C  (crit = +110.0°C)

coretemp-isa-
Adapter: ISA adapter
Core 0:   +66.0°C  (high = +105.0°C, crit = +105.0°C)
Core 1:   +63.0°C  (high = +105.0°C, crit = +105.0°C)


 3.12.13 -- 20140310 -- 00:26 -- good
=

Re: [PATCH v2][RESENT] dma: add R-Car Audio DMAC peri peri driver

2014-03-09 Thread Joe Perches
On Sun, 2014-03-09 at 18:34 -0700, Kuninori Morimoto wrote:
> Add support for the Audio DMAC peri peri driver
> for Renesas R-Car Gen2 SoC, using 'shdma-base'
> DMA driver framework.

Trivial notes:

> diff --git a/drivers/dma/sh/rcar-audmapp.c b/drivers/dma/sh/rcar-audmapp.c
[]
> +static int audmapp_chan_probe(struct platform_device *pdev,
> +   struct audmapp_device *audev, int id)
> +{
[]
> + auchan = devm_kzalloc(dev, sizeof(struct audmapp_chan), GFP_KERNEL);
> + if (!auchan) {
> + dev_err(dev, "No free memory for allocating dma channels!\n");

Unnecessary OOM message, as the allocator already emits
a generic OOM message and a dump_stack()

[]
> +static int audmapp_probe(struct platform_device *pdev)
> +{
[]
> + audev = devm_kzalloc(&pdev->dev, sizeof(struct audmapp_device),
> +  GFP_KERNEL);
> + if (!audev) {
> + dev_err(&pdev->dev, "Not enough memory\n");

here too




RE: [PATCH] [staging][r8188eu]: memory leak in rtw_free_cmd_obj if command is (_Set_Drv_Extra)

2014-03-09 Thread Liu, Chuansheng
Hi,

> -Original Message-
> From: Wang, Xiaoming
> Sent: Monday, March 10, 2014 11:38 PM
> To: gre...@linuxfoundation.org; valentina.mane...@gmail.com;
> dan.carpen...@oracle.com; standby2...@gmail.com
> Cc: de...@driverdev.osuosl.org; linux-kernel@vger.kernel.org; Zhang,
> Dongxing; Wang, Xiaoming; Liu, Chuansheng
> Subject: [PATCH] [staging][r8188eu]: memory leak in rtw_free_cmd_obj if
> command is (_Set_Drv_Extra)
> 
> pcmd->parmbuf->pbuf is allocated when the command is
> GEN_CMD_CODE(_Set_Drv_Extra), and the command is enqueued by
> rtw_enqueue_cmd. rtw_cmd_thread dequeues pcmd with rtw_dequeue_cmd.
> The memory leak happens on the branch
> "if (_FAIL == rtw_cmd_filter(pcmdpriv, pcmd))", which jumps to
> post_process directly and so skips the freeing of pcmd->parmbuf->pbuf in
> rtw_drvextra_cmd_hdl, the cmd_hdl for (_Set_Drv_Extra).
> This patch frees pcmd->parmbuf->pbuf on the forgotten branch to avoid the
> memory leak.
> 
> Signed-off-by: Zhang Dongxing 
> Signed-off-by: xiaoming wang 
> 
> diff --git a/drivers/staging/rtl8188eu/core/rtw_cmd.c
> b/drivers/staging/rtl8188eu/core/rtw_cmd.c
> index c0a0a52..1c7f505 100644
> --- a/drivers/staging/rtl8188eu/core/rtw_cmd.c
> +++ b/drivers/staging/rtl8188eu/core/rtw_cmd.c
> @@ -288,7 +288,7 @@ int rtw_cmd_thread(void *context)
> void (*pcmd_callback)(struct adapter *dev, struct cmd_obj *pcmd);
> struct adapter *padapter = (struct adapter *)context;
> struct cmd_priv *pcmdpriv = &(padapter->cmdpriv);
> -
> +   struct drvextra_cmd_parm *extra_parm = NULL;
> 
> thread_enter("RTW_CMD_THREAD");
> 
> @@ -323,6 +323,11 @@ _next:
> 
> if (_FAIL == rtw_cmd_filter(pcmdpriv, pcmd)) {
> pcmd->res = H2C_DROPPED;
> +   if (pcmd->cmdcode == GEN_CMD_CODE(_Set_Drv_Extra)) {
> +   extra_parm = (struct drvextra_cmd_parm *)pcmd->parmbuf;
> +   if (extra_parm && extra_parm->pbuf && extra_parm->size > 0)
> +   rtw_mfree(extra_parm->pbuf, extra_parm->size);
> +   }
> goto post_process;
> }
> 

Reviewed-by: Chuansheng Liu 

Thanks.

Best Regards
Chuansheng



Re: [f2fs-dev] [PATCH 1/5] f2fs: update start nid only once each circle

2014-03-09 Thread Gu Zheng
On 03/08/2014 07:46 PM, Chao Yu wrote:

> Hi Gu,
> 
>> -Original Message-
>> From: Gu Zheng [mailto:guz.f...@cn.fujitsu.com]
>> Sent: Friday, March 07, 2014 6:43 PM
>> To: Kim
>> Cc: linux-kernel; f2fs
>> Subject: [f2fs-dev] [PATCH 1/5] f2fs: update start nid only once each circle
>>
>>
>> Signed-off-by: Gu Zheng 
> 
> Reviewed-by: Chao Yu 
> 
>> ---
>>  fs/f2fs/node.c |6 +-
>>  1 files changed, 5 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
>> index 8c14110..9653096 100644
>> --- a/fs/f2fs/node.c
>> +++ b/fs/f2fs/node.c
>> @@ -1875,11 +1875,15 @@ void destroy_node_manager(struct f2fs_sb_info *sbi)
>>  while ((found = __gang_lookup_nat_cache(nm_i,
>>  nid, NATVEC_SIZE, natvec))) {
>>  unsigned idx;
>> +
>> +nid = nat_get_nid(natvec[found - 1]) + 1;
>> +
>>  for (idx = 0; idx < found; idx++) {
>>  struct nat_entry *e = natvec[idx];
> 
> Could we replace argument 'e' with 'natvec[idx]'? then we could remove 'e' and
> brace here.

Agree. More neat with this cleanup.

Regards,
Gu

> 
> Thanks.
> 
>> -nid = nat_get_nid(e) + 1;
>> +
>>  __del_from_nat_cache(nm_i, e);
>>  }
>> +
>>  }
>>  f2fs_bug_on(nm_i->nat_cnt);
>>  write_unlock(&nm_i->nat_tree_lock);
>> --
>> 1.7.7
>>
>>
> 
> 




[PATCH v2][RESENT] dma: add R-Car Audio DMAC peri peri driver

2014-03-09 Thread Kuninori Morimoto
From: Kuninori Morimoto 

Add support for the Audio DMAC peri peri driver
for Renesas R-Car Gen2 SoC, using 'shdma-base'
DMA driver framework.

Signed-off-by: Kuninori Morimoto 
---
resent

 - add missing "dmaeng...@vger.kernel.org"

v1 -> v2

 - run scripts/checkpatch.pl
 - exchange length settings on audmapp_desc_setup()
 - exchange slave_id check on audmapp_find_slave()

 drivers/dma/sh/Kconfig |6 +
 drivers/dma/sh/Makefile|1 +
 drivers/dma/sh/rcar-audmapp.c  |  325 
 include/linux/platform_data/dma-rcar-audmapp.h |   34 +++
 4 files changed, 366 insertions(+)
 create mode 100644 drivers/dma/sh/rcar-audmapp.c
 create mode 100644 include/linux/platform_data/dma-rcar-audmapp.h

diff --git a/drivers/dma/sh/Kconfig b/drivers/dma/sh/Kconfig
index dadd9e01..b4c8138 100644
--- a/drivers/dma/sh/Kconfig
+++ b/drivers/dma/sh/Kconfig
@@ -29,6 +29,12 @@ config RCAR_HPB_DMAE
help
  Enable support for the Renesas R-Car series DMA controllers.
 
+config RCAR_AUDMAC_PP
+   tristate "Renesas R-Car Audio DMAC Peripheral Peripheral support"
+   depends on SH_DMAE_BASE
+   help
+ Enable support for the Renesas R-Car Audio DMAC Peripheral Peripheral controllers.
+
 config SHDMA_R8A73A4
def_bool y
depends on ARCH_R8A73A4 && SH_DMAE != n
diff --git a/drivers/dma/sh/Makefile b/drivers/dma/sh/Makefile
index e856af2..1ce88b2 100644
--- a/drivers/dma/sh/Makefile
+++ b/drivers/dma/sh/Makefile
@@ -7,3 +7,4 @@ endif
 shdma-objs := $(shdma-y)
 obj-$(CONFIG_SUDMAC) += sudmac.o
 obj-$(CONFIG_RCAR_HPB_DMAE) += rcar-hpbdma.o
+obj-$(CONFIG_RCAR_AUDMAC_PP) += rcar-audmapp.o
diff --git a/drivers/dma/sh/rcar-audmapp.c b/drivers/dma/sh/rcar-audmapp.c
new file mode 100644
index 000..cd3c237
--- /dev/null
+++ b/drivers/dma/sh/rcar-audmapp.c
@@ -0,0 +1,325 @@
+/*
+ * drivers/dma/sh/rcar-audmapp.c
+ *
+ * Copyright (C) 2013 Renesas Electronics Corporation
+ * Copyright (C) 2013 Kuninori Morimoto 
+ *
+ * based on the drivers/dma/sh/shdma.c
+ *
+ * Copyright (C) 2011-2012 Guennadi Liakhovetski 
+ * Copyright (C) 2009 Nobuhiro Iwamatsu 
+ * Copyright (C) 2009 Renesas Solutions, Inc. All rights reserved.
+ * Copyright (C) 2007 Freescale Semiconductor, Inc. All rights reserved.
+ *
+ * This is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * DMA register
+ */
+#define PDMASAR0x00
+#define PDMADAR0x04
+#define PDMACHCR   0x0c
+
+/* PDMACHCR */
+#define PDMACHCR_DE(1 << 0)
+
+#define AUDMAPP_MAX_CHANNELS   29
+
+/* Default MEMCPY transfer size = 2^2 = 4 bytes */
+#define LOG2_DEFAULT_XFER_SIZE 2
+#define AUDMAPP_SLAVE_NUMBER   256
+#define AUDMAPP_LEN_MAX(16 * 1024 * 1024)
+
+struct audmapp_chan {
+   struct shdma_chan shdma_chan;
+   struct audmapp_slave_config *config;
+   void __iomem *base;
+};
+
+struct audmapp_device {
+   struct shdma_dev shdma_dev;
+   struct audmapp_pdata *pdata;
+   struct device *dev;
+   void __iomem *chan_reg;
+};
+
+#define to_chan(chan) container_of(chan, struct audmapp_chan, shdma_chan)
+#define to_dev(chan) container_of(chan->shdma_chan.dma_chan.device,\
+ struct audmapp_device, shdma_dev.dma_dev)
+
+static void audmapp_write(struct audmapp_chan *auchan, u32 data, u32 reg)
+{
+   struct audmapp_device *audev = to_dev(auchan);
+   struct device *dev = audev->dev;
+
+   dev_dbg(dev, "w %p : %08x\n", auchan->base + reg, data);
+
+   iowrite32(data, auchan->base + reg);
+}
+
+static u32 audmapp_read(struct audmapp_chan *auchan, u32 reg)
+{
+   return ioread32(auchan->base + reg);
+}
+
+static void audmapp_halt(struct shdma_chan *schan)
+{
+   struct audmapp_chan *auchan = to_chan(schan);
+   int i;
+
+   audmapp_write(auchan, 0, PDMACHCR);
+
+   for (i = 0; i < 1024; i++) {
+   if (0 == audmapp_read(auchan, PDMACHCR))
+   return;
+   udelay(1);
+   }
+}
+
+static void audmapp_start_xfer(struct shdma_chan *schan,
+  struct shdma_desc *sdecs)
+{
+   struct audmapp_chan *auchan = to_chan(schan);
+   struct audmapp_device *audev = to_dev(auchan);
+   struct audmapp_slave_config *cfg = auchan->config;
+   struct device *dev = audev->dev;
+   u32 chcr = cfg->chcr | PDMACHCR_DE;
+
+   dev_dbg(dev, "src/dst/chcr = %x/%x/%x\n",
+   cfg->src, cfg->dst, cfg->chcr);
+
+   audmapp_write(auchan, cfg->src, PDMASAR);
+   audmapp_write(auchan, cfg->dst, PDMADAR);
+   audmapp_write(auchan, chcr, PDMACHCR);
+}
+
+static 

[PATCH] [staging][r8188eu]: memory leak in rtw_free_cmd_obj if command is (_Set_Drv_Extra)

2014-03-09 Thread Wang, Xiaoming
pcmd->parmbuf->pbuf is allocated when the command is
GEN_CMD_CODE(_Set_Drv_Extra), and the command is enqueued by
rtw_enqueue_cmd. rtw_cmd_thread dequeues pcmd with rtw_dequeue_cmd.
The memory leak happens on the branch
"if (_FAIL == rtw_cmd_filter(pcmdpriv, pcmd))", which jumps to
post_process directly and so skips the freeing of pcmd->parmbuf->pbuf in
rtw_drvextra_cmd_hdl, the cmd_hdl for (_Set_Drv_Extra).
This patch frees pcmd->parmbuf->pbuf on the forgotten branch to avoid the
memory leak.

Signed-off-by: Zhang Dongxing 
Signed-off-by: xiaoming wang 

diff --git a/drivers/staging/rtl8188eu/core/rtw_cmd.c 
b/drivers/staging/rtl8188eu/core/rtw_cmd.c
index c0a0a52..1c7f505 100644
--- a/drivers/staging/rtl8188eu/core/rtw_cmd.c
+++ b/drivers/staging/rtl8188eu/core/rtw_cmd.c
@@ -288,7 +288,7 @@ int rtw_cmd_thread(void *context)
void (*pcmd_callback)(struct adapter *dev, struct cmd_obj *pcmd);
struct adapter *padapter = (struct adapter *)context;
struct cmd_priv *pcmdpriv = &(padapter->cmdpriv);
-
+   struct drvextra_cmd_parm *extra_parm = NULL;

thread_enter("RTW_CMD_THREAD");

@@ -323,6 +323,11 @@ _next:

if (_FAIL == rtw_cmd_filter(pcmdpriv, pcmd)) {
pcmd->res = H2C_DROPPED;
+   if (pcmd->cmdcode == GEN_CMD_CODE(_Set_Drv_Extra)) {
+   extra_parm = (struct drvextra_cmd_parm *)pcmd->parmbuf;
+   if (extra_parm && extra_parm->pbuf && extra_parm->size > 0)
+   rtw_mfree(extra_parm->pbuf, extra_parm->size);
+   }
goto post_process;
}




LK bug with SGH-T999V (d2tmo)

2014-03-09 Thread Pierre-Yves Tremblay
1. LK v9.2 and higher won't work at all on d2tmo nightlies
(CM11) and on d2lte nightlies (CM11) for SGH-T999V

2. The phone just can't run with this kernel anymore since version 9.2
(d2att) with cm11's d2tmo and version 9.4 (d2) with cm11's d2lte.
Phone app crashes repetitively with v9.4 of d2 with CM11's latest
nightlies of d2lte. With v9.1 of d2att and cm11's nightlies of d2tmo,
lk has always worked very well, same for previous versions of lk's
d2att combined with cm10's d2tmo nightlies. 9.2 and higher just
totally fail at boot with my phone. I don't remember exactly if it was
a phone app crash or something else, but it was a major fail at boot
just like it is now with the repetitive phone app crash caused by lk
9.4 (d2) with cm11's d2lte.

3. Repetitive phone app crash at boot

4.1 root@d2tmo:/ # cat /proc/version
Linux version 3.4.82-cyanogenmod-g5d45d15 (build03@cyanogenmod) (gcc
version 4.7 (GCC) ) #1 SMP PREEMPT Sat Mar 8 16:53:43 PST 2014

4.2 leankernel can't run on d2lte for my phone

5. v9.1 from d2att with cm11's d2tmo

8. SGH-T999V Canadian Galaxy S3 provided by Videotron running CM11.
The bug has been happening on both nightly versions of CM11's
d2tmo and nightly versions of d2lte.

I'm unable to run leankernel anymore since I have upgraded from d2tmo
to d2lte nightlies on CM11. I had been trying lk 9.2 (d2att) for a
while but it always failed. I can't use version 9.1 from d2att with
d2lte nightlies, so now i'm running without leankernel and I miss it.

1|root@d2tmo:/ # cat /proc/cpuinfo
cat /proc/cpuinfo
Processor   : ARMv7 Processor rev 0 (v7l)
processor   : 0
BogoMIPS: 13.50

processor   : 1
BogoMIPS: 13.50

Features: swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt
CPU implementer : 0x51
CPU architecture: 7
CPU variant : 0x1
CPU part: 0x04d
CPU revision: 0

Hardware: SAMSUNG M2
Revision: 0010
Serial  : a107aa56
root@d2tmo:/ # cat /proc/modules
cat /proc/modules
dhd 324767 0 - Live 0x
root@d2tmo:/ # cat /proc/ioports
cat /proc/ioports
0008-0008 : i2c_sda
0009-0009 : i2c_clk
000a-000a : gpio_host_wake
000c-000c : a2220_sda
000d-000d : a2220_sck
0026-0026 : spi_mosi
0027-0027 : spi_miso
0028-0028 : spi_cs
0029-0029 : spi_clk
004f-004f : gpio_ext_wake
0054-0058 : wcnss_gpios_5wire
011d-011d : pmic_rtc_base
root@d2tmo:/ # cat /proc/iomem
cat /proc/iomem
0050-00500fff : msm_ssbi.0
0070-007060ff : hdmi_msm_qfprom_addr
0080207c-0080207f : slimbus_slew_reg
0300-0327 : wcnss_mmio
  03204000-032040ff : pil_riva
0410-04100fff : kgsl_2d0_reg_memory
  0410-04100fff : kgsl-2d0
0420-04200fff : kgsl_2d1_reg_memory
  0420-04200fff : kgsl-2d1
0430-0431 : kgsl_3d0_reg_memory
  0430-0431 : kgsl-3d0
0440-044f : msm_vidc.0
0450-045f : vfe32
  0450-045f : msm_vfe
0460-046f : msm_gemini.0
0470-047e : mipi_dsi
0480-048003ff : csid
  0480-048003ff : msm_csid
04800400-048007ff : csid
  04800400-048007ff : msm_csid
04800800-04800bff : ispif
  04800800-04800bff : msm_ispif
04800c00-04800fff : csiphy
  04800c00-04800fff : msm_csiphy
04801000-048013ff : csiphy
  04801000-048013ff : msm_csiphy
04a0-04a00fff : hdmi_msm_hdmi_addr
04e0-04ef : msm_rotator.0
0510-051e : mdp
0530-053f : vpe
  0530-053f : msm_vpe
0730-073f : physbase
  0730-073f : physbase
0740-074f : physbase
  0740-074f : physbase
0750-075f : physbase
  0750-075f : physbase
0760-076f : physbase
  0760-076f : physbase
0770-077f : physbase
  0770-077f : physbase
0780-078f : physbase
  0780-078f : physbase
0790-079f : physbase
  0790-079f : physbase
07a0-07af : physbase
  07a0-07af : physbase
07b0-07bf : physbase
  07b0-07bf : physbase
07c0-07cf : physbase
  07c0-07cf : physbase
07d0-07df : physbase
  07d0-07df : physbase
07e0-07ef : physbase
  07e0-07ef : physbase
0880-088000ff : pil_qdsp6v4.1
0890-089000ff : pil_qdsp6v4.2
08b0-08b3 : pil_qdsp6v4.1
  08b0-08b3 : pil_qdsp6v4.2
1218-121807ff : core_mem
12180800-12181fff : dml_mem
12182000-12183fff : bam_mem
121c-121c07ff : core_mem
121c0800-121c1fff : dml_mem
121c2000-121c3fff : bam_mem
1224-12240fff : bamdma_dma
12244000-12247fff : bamdma_bam
1240-124007ff : core_mem
12400800-12401fff : dml_mem
12402000-12403fff : bam_mem
1244-12440003 : gsbi_base
  1244-12440003 : spi_qsd
1246-12460fff : spi_base
  1246-12460fff : spi_qsd
1248-12480003 : gsbi_qup_i2c_addr
  1248-12480003 : qup_i2c
124a-124a0fff : qup_phys_addr
  124a-124a0fff : qup_i2c
1250-12501000 : msm_hsusb
  1250-12501000 : msm_otg
1250-12500fff : 

Re: [PATCH] netlink: switch net_ns only if net is not init_net

2014-03-09 Thread Gu Zheng
On 03/10/2014 07:09 AM, David Miller wrote:

> From: Gu Zheng 
> Date: Fri, 07 Mar 2014 18:47:30 +0800
> 
>> Many netlink users create netlink sock in the init_net, and the
>> switching net_ns (init_net-->net) is needless in this case. So here
>> we add a pre-check to avoid this.
>>
>> Signed-off-by: Gu Zheng 
> 
> This check is more appropriately placed into sk_change_net() itself.

Agree. I'll update it soon.

Regards,
Gu

> 




[PATCH v3 5/5] x86: Always define BUG() and HAVE_ARCH_BUG, even with !CONFIG_BUG

2014-03-09 Thread Josh Triplett
This ensures that BUG() always has a definition that causes a trap (via
an undefined instruction), and that the compiler still recognizes the
code following BUG() as unreachable, avoiding warnings that would
otherwise appear (such as on non-void functions that don't return a
value after BUG()).

In addition to saving a few bytes over the generic infinite-loop
implementation, this implementation traps rather than looping, which
potentially allows for better error-recovery behavior (such as by
rebooting).

Reported-by: Arnd Bergmann 
Signed-off-by: Josh Triplett 
---
v3: New patch for x86-optimized implementation of a simple bug for
!CONFIG_BUG, based on Arnd Bergmann's proposal.

 arch/x86/include/asm/bug.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
index 2f03ff0..ba38ebb 100644
--- a/arch/x86/include/asm/bug.h
+++ b/arch/x86/include/asm/bug.h
@@ -1,7 +1,6 @@
 #ifndef _ASM_X86_BUG_H
 #define _ASM_X86_BUG_H
 
-#ifdef CONFIG_BUG
 #define HAVE_ARCH_BUG
 
 #ifdef CONFIG_DEBUG_BUGVERBOSE
@@ -33,8 +32,6 @@ do {  \
 } while (0)
 #endif
 
-#endif /* !CONFIG_BUG */
-
 #include 
 
 #endif /* _ASM_X86_BUG_H */
-- 
1.9.0



[PATCH v3 4/5] bug: Make BUG() always stop the machine

2014-03-09 Thread Josh Triplett
When !CONFIG_BUG and !HAVE_ARCH_BUG, define the generic BUG() as an
infinite loop rather than a no-op.  This avoids undefined behavior if
execution ever actually reaches BUG(), and avoids warnings about code
after BUG() (such as on non-void functions calling BUG() and then
not returning).
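
A minimal userspace sketch of the effect described above, assuming a GCC- or Clang-style compiler; `enum color` and `color_weight()` are hypothetical names used only for illustration. With the old `while (0)` form, compilers typically warn that control reaches the end of a non-void function; with `while (1)` the end of the function is not reachable, so no return value is ever left undefined.

```c
#include <assert.h>

/* Userspace stand-in for the generic !CONFIG_BUG definition in this patch. */
#define BUG() do {} while (1)

enum color { RED, GREEN };

/*
 * Non-void function whose fall-through path ends in BUG(). Because the
 * loop never terminates, control cannot run off the end of the function.
 */
static int color_weight(enum color c)
{
	switch (c) {
	case RED:
		return 1;
	case GREEN:
		return 2;
	}
	BUG();
}
```

Only the valid cases should be exercised when trying this out, since reaching BUG() spins forever by design.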

bloat-o-meter results:

add/remove: 0/0 grow/shrink: 43/10 up/down: 235/-98 (137)
function old new   delta
umount_collect   119 138 +19
notify_change306 324 +18
xstate_enable_boot_cpu   252 269 +17
kunmap54  70 +16
balloon_page_dequeue 112 126 +14
mm_take_all_locks223 233 +10
list_lru_walk_node   143 152  +9
vma_adjust  10591067  +8
pcpu_setup_first_chunk  11301138  +8
mm_drop_all_locks143 151  +8
ns_capable55  62  +7
anon_transport_class_unregister8  15  +7
srcu_init_notifier_head   35  41  +6
shrink_dcache_for_umount 174 180  +6
kunmap_high   99 105  +6
end_page_writeback43  49  +6
do_exit 13391345  +6
__kfifo_dma_out_prepare_r 86  92  +6
__kfifo_dma_in_prepare_r  90  96  +6
fixup_user_fault 120 125  +5
repair_env_string 73  77  +4
read_cache_pages_invalidate_page  56  60  +4
isolate_lru_pages.isra   142 146  +4
do_notify_parent_cldstop 255 259  +4
cpu_init 370 374  +4
utimes_common270 272  +2
tasklet_hi_action 91  93  +2
tasklet_action91  93  +2
set_pte_vaddr 46  48  +2
find_get_pages_tag   202 204  +2
early_iounmap185 187  +2
__native_set_fixmap   36  38  +2
__get_user_pages 822 824  +2
__early_ioremap  299 301  +2
yield_task_stop1   2  +1
tick_resume   37  38  +1
switched_to_stop   1   2  +1
switched_to_idle   1   2  +1
prio_changed_stop  1   2  +1
prio_changed_idle  1   2  +1
pm_qos_power_read111 112  +1
arch_cpu_idle_dead 1   2  +1
__insert_vmap_area   140 141  +1
sys_renameat 614 612  -2
mm_fault_error   297 295  -2
SyS_renameat 614 612  -2
sys_linkat   416 413  -3
SyS_linkat   416 413  -3
chmod_common 129 122  -7
proc_cap_handler 240 225 -15
__schedule   849 831 -18
sys_madvise 10771054 -23
SyS_madvise 10771054 -23

Reported-by: Arnd Bergmann 
Signed-off-by: Josh Triplett 
---
v3: New patch in this series, incorporating Arnd's suggestion to make BUG()
always stop execution.  This eliminates the new warnings currently appearing in
allnoconfig due to !CONFIG_BUG; unlike v2, this avoids using unreachable(), and
thus avoids the possibility of undefined behavior if execution actually reaches
a call to BUG().

 include/asm-generic/bug.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index a97fa11..630dd23 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -138,7 +138,7 @@ extern void warn_slowpath_null(const char *file, const int line);
 
 #else /* !CONFIG_BUG */
 #ifndef HAVE_ARCH_BUG
-#define BUG() do {} while (0)
+#define BUG() do {} while (1)
 #endif
 
 #ifndef HAVE_ARCH_BUG_ON
-- 
1.9.0


[PATCH v3 3/5] bug: When !CONFIG_BUG, make WARN call no_printk to check format and args

2014-03-09 Thread Josh Triplett
The stub version of WARN for !CONFIG_BUG completely ignored its format
string and subsequent arguments; make it check them instead, using
no_printk.
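
As a rough userspace sketch of the idea (not the kernel's actual headers), the stub below assumes GCC/Clang extensions: the `format` attribute, statement expressions and named variadic macro arguments. The `no_printk()` and `unlikely()` definitions here are simplified stand-ins written for this example.

```c
#include <assert.h>

/*
 * Stand-in for no_printk(): the format attribute makes the compiler check
 * the format string against its arguments, while the body emits nothing.
 */
__attribute__((format(printf, 1, 2)))
static inline int no_printk(const char *fmt, ...)
{
	(void)fmt;
	return 0;
}

#define unlikely(x) __builtin_expect(!!(x), 0)

/* Stub WARN: evaluates the condition, type-checks the message, prints nothing. */
#define WARN(condition, format...) ({			\
	int __ret_warn_on = !!(condition);		\
	no_printk(format);				\
	unlikely(__ret_warn_on);			\
})
```

With this shape, a call such as `WARN(x, "%s\n", 42)` would be expected to draw a format warning at build time even though no message is ever printed.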

Reported-by: Arnd Bergmann 
Signed-off-by: Josh Triplett 
---
v3: Patch unchanged from v2.
 include/asm-generic/bug.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 2d54d8d..a97fa11 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -155,6 +155,7 @@ extern void warn_slowpath_null(const char *file, const int line);
 #ifndef WARN
 #define WARN(condition, format...) ({  \
int __ret_warn_on = !!(condition);  \
+   no_printk(format);  \
unlikely(__ret_warn_on);\
 })
 #endif
-- 
1.9.0



[PATCH v3 2/5] include/asm-generic/bug.h: Style fix: s/while(0)/while (0)/

2014-03-09 Thread Josh Triplett
Reported-by: Randy Dunlap 
Signed-off-by: Josh Triplett 
---
v3: Patch unchanged from v2.
 include/asm-generic/bug.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 7ecd398..2d54d8d 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -52,7 +52,7 @@ struct bug_entry {
 #endif
 
 #ifndef HAVE_ARCH_BUG_ON
-#define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while(0)
+#define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0)
 #endif
 
 /*
@@ -138,11 +138,11 @@ extern void warn_slowpath_null(const char *file, const int line);
 
 #else /* !CONFIG_BUG */
 #ifndef HAVE_ARCH_BUG
-#define BUG() do {} while(0)
+#define BUG() do {} while (0)
 #endif
 
 #ifndef HAVE_ARCH_BUG_ON
-#define BUG_ON(condition) do { if (condition) ; } while(0)
+#define BUG_ON(condition) do { if (condition) ; } while (0)
 #endif
 
 #ifndef HAVE_ARCH_WARN_ON
-- 
1.9.0



RE: [PATCH] regulator: act8865: Remove unnecessary *rdev[] from struct act8865

2014-03-09 Thread Yang, Wenyou


> -Original Message-
> From: Axel Lin [mailto:axel@ingics.com]
> Sent: Saturday, March 08, 2014 9:19 PM
> To: Mark Brown
> Cc: Yang, Wenyou; Liam Girdwood; linux-kernel@vger.kernel.org
> Subject: [PATCH] regulator: act8865: Remove unnecessary *rdev[] from
> struct act8865
> 
> Now we are using devm_regulator_register(), so we don't need the *rdev[]
> array to store the return value of devm_regulator_register(). A single
> *rdev variable is enough for checking the return status.
> 
> Signed-off-by: Axel Lin 
Acked-by: Wenyou Yang 
> ---
>  drivers/regulator/act8865-regulator.c | 13 +
>  1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/regulator/act8865-regulator.c
> b/drivers/regulator/act8865-regulator.c
> index a5ff30c..b92d7dd 100644
> --- a/drivers/regulator/act8865-regulator.c
> +++ b/drivers/regulator/act8865-regulator.c
> @@ -62,7 +62,6 @@
>  #define  ACT8865_VOLTAGE_NUM 64
> 
>  struct act8865 {
> - struct regulator_dev *rdev[ACT8865_REG_NUM];
>   struct regmap *regmap;
>  };
> 
> @@ -256,7 +255,7 @@ static inline int act8865_pdata_from_dt(struct
> device *dev,  static int act8865_pmic_probe(struct i2c_client *client,
>  const struct i2c_device_id *i2c_id)  {
> - struct regulator_dev **rdev;
> + struct regulator_dev *rdev;
>   struct device *dev = &client->dev;
>   struct act8865_platform_data *pdata = dev_get_platdata(dev);
>   struct regulator_config config = { };
> @@ -290,8 +289,6 @@ static int act8865_pmic_probe(struct i2c_client
> *client,
>   if (!act8865)
>   return -ENOMEM;
> 
> - rdev = act8865->rdev;
> -
>   act8865->regmap = devm_regmap_init_i2c(client,
> &act8865_regmap_config);
>   if (IS_ERR(act8865->regmap)) {
>   error = PTR_ERR(act8865->regmap);
> @@ -311,12 +308,12 @@ static int act8865_pmic_probe(struct i2c_client
> *client,
>   config.driver_data = act8865;
>   config.regmap = act8865->regmap;
> 
> - rdev[i] = devm_regulator_register(&client->dev,
> - &act8865_reg[i], &config);
> - if (IS_ERR(rdev[i])) {
> + rdev = devm_regulator_register(&client->dev, &act8865_reg[i],
> +&config);
> + if (IS_ERR(rdev)) {
>   dev_err(dev, "failed to register %s\n",
>   act8865_reg[id].name);
> - return PTR_ERR(rdev[i]);
> + return PTR_ERR(rdev);
>   }
>   }
> 
> --
> 1.8.1.2
> 
> 



[PATCH] net: phy: fix uninitalized WOL parameters in phy_ethtool_get_wol

2014-03-09 Thread Sebastian Hesselbarth
phy_ethtool_get_wol is a helper to get current WOL settings from
a phy device. When using this helper on a PHY without .get_wol
callback, struct ethtool_wolinfo is never set-up correctly and
may contain misleading information about WOL status.

To fix this, always zero relevant fields of struct ethtool_wolinfo
regardless of .get_wol callback availability.

Signed-off-by: Sebastian Hesselbarth 
---
Cc: David Miller 
Cc: Florian Fainelli 
Cc: net...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/net/phy/phy.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 19c9eca0ef26..62a7cd401e1c 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1092,6 +1092,7 @@ EXPORT_SYMBOL(phy_ethtool_set_wol);
 
 void phy_ethtool_get_wol(struct phy_device *phydev, struct ethtool_wolinfo 
*wol)
 {
+   wol->supported = wol->wolopts = 0;
if (phydev->drv->get_wol)
phydev->drv->get_wol(phydev, wol);
 }
-- 
1.8.5.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
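
For readers following along, here is a minimal userspace sketch of the
problem the patch above addresses. It is only a model: struct wolinfo,
struct phy_model and the get_wol_unfixed()/get_wol_fixed()/stale_wolopts()
helpers are hypothetical stand-ins, not the kernel's types or functions.
The point is that a caller-supplied struct keeps whatever it held before
when the driver has no callback, unless the helper clears it first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields the patch zeroes. */
struct wolinfo {
	uint32_t supported;
	uint32_t wolopts;
};

/* Hypothetical stand-in for a PHY driver; get_wol may be NULL. */
struct phy_model {
	void (*get_wol)(struct wolinfo *wol);
};

/* Shape before the patch: without a callback, wol is left untouched. */
static void get_wol_unfixed(const struct phy_model *phy, struct wolinfo *wol)
{
	if (phy->get_wol)
		phy->get_wol(wol);
}

/* Shape after the patch: clear the fields, then run the optional callback. */
static void get_wol_fixed(const struct phy_model *phy, struct wolinfo *wol)
{
	wol->supported = wol->wolopts = 0;
	if (phy->get_wol)
		phy->get_wol(wol);
}

/* Returns wolopts as seen by a caller whose struct held stale data. */
static uint32_t stale_wolopts(int use_fix)
{
	struct phy_model phy = { .get_wol = NULL };
	struct wolinfo wol = { .supported = 0x20, .wolopts = 0x20 }; /* stale */

	if (use_fix)
		get_wol_fixed(&phy, &wol);
	else
		get_wol_unfixed(&phy, &wol);
	return wol.wolopts;
}
```

In this model stale_wolopts(0) hands the stale 0x20 back to the caller,
while stale_wolopts(1) reports 0, i.e. no WOL options.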


[PATCH v3 1/5] bug: When !CONFIG_BUG, simplify WARN_ON_ONCE and family

2014-03-09 Thread Josh Triplett
When !CONFIG_BUG, WARN_ON and family become simple passthroughs of their
condition argument; however, WARN_ON_ONCE and family still have
conditions and a boolean to detect one-time invocation, even though the
warning they'd emit doesn't exist.  Make the existing definitions
conditional on CONFIG_BUG, and add definitions for !CONFIG_BUG that map
to the passthrough versions of WARN and WARN_ON.

This saves 4.4k on a minimized configuration (smaller than
allnoconfig), and 20.6k with defconfig plus CONFIG_BUG=n.

Signed-off-by: Josh Triplett 
---
v3: Patch unchanged from v2.

Andrew, can you please replace the entire v2 series currently in -mm
with this new series?

 include/asm-generic/bug.h | 57 +--
 1 file changed, 30 insertions(+), 27 deletions(-)

diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 7d10f96..7ecd398 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -106,33 +106,6 @@ extern void warn_slowpath_null(const char *file, const int 
line);
unlikely(__ret_warn_on);\
 })
 
-#else /* !CONFIG_BUG */
-#ifndef HAVE_ARCH_BUG
-#define BUG() do {} while(0)
-#endif
-
-#ifndef HAVE_ARCH_BUG_ON
-#define BUG_ON(condition) do { if (condition) ; } while(0)
-#endif
-
-#ifndef HAVE_ARCH_WARN_ON
-#define WARN_ON(condition) ({  \
-   int __ret_warn_on = !!(condition);  \
-   unlikely(__ret_warn_on);\
-})
-#endif
-
-#ifndef WARN
-#define WARN(condition, format...) ({  \
-   int __ret_warn_on = !!(condition);  \
-   unlikely(__ret_warn_on);\
-})
-#endif
-
-#define WARN_TAINT(condition, taint, format...) WARN_ON(condition)
-
-#endif
-
 #define WARN_ON_ONCE(condition)({  \
static bool __section(.data.unlikely) __warned; \
int __ret_warn_once = !!(condition);\
@@ -163,6 +136,36 @@ extern void warn_slowpath_null(const char *file, const int 
line);
unlikely(__ret_warn_once);  \
 })
 
+#else /* !CONFIG_BUG */
+#ifndef HAVE_ARCH_BUG
+#define BUG() do {} while(0)
+#endif
+
+#ifndef HAVE_ARCH_BUG_ON
+#define BUG_ON(condition) do { if (condition) ; } while(0)
+#endif
+
+#ifndef HAVE_ARCH_WARN_ON
+#define WARN_ON(condition) ({  \
+   int __ret_warn_on = !!(condition);  \
+   unlikely(__ret_warn_on);\
+})
+#endif
+
+#ifndef WARN
+#define WARN(condition, format...) ({  \
+   int __ret_warn_on = !!(condition);  \
+   unlikely(__ret_warn_on);\
+})
+#endif
+
+#define WARN_ON_ONCE(condition) WARN_ON(condition)
+#define WARN_ONCE(condition, format...) WARN(condition, format)
+#define WARN_TAINT(condition, taint, format...) WARN(condition, format)
+#define WARN_TAINT_ONCE(condition, taint, format...) WARN(condition, format)
+
+#endif
+
 /*
  * WARN_ON_SMP() is for cases that the warning is either
  * meaningless for !SMP or may even cause failures.
-- 
1.9.0
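
A rough userspace sketch of the two macro shapes described in the
changelog, for illustration only. The DEMO_* macros, warnings_printed and
the two hits_*() helpers are made-up names, and the GNU statement
expression syntax assumes GCC or Clang. The "once" variant carries a
static flag and a message; the passthrough variant keeps only the truth
value of the condition.

```c
#include <stdio.h>

static int warnings_printed; /* counts messages, for demonstration only */

/* Sketch of the CONFIG_BUG=y shape: a static flag limits output to once. */
#define DEMO_WARN_ON_ONCE(condition) ({					\
	static int demo_warned;						\
	int demo_ret = !!(condition);					\
	if (demo_ret && !demo_warned) {					\
		demo_warned = 1;					\
		warnings_printed++;					\
		fprintf(stderr, "warning: %s:%d\n", __FILE__, __LINE__);	\
	}								\
	demo_ret;							\
})

/* Sketch of the !CONFIG_BUG shape: only the truth value survives. */
#define DEMO_WARN_ON_ONCE_NOBUG(condition) ({				\
	int demo_ret = !!(condition);					\
	demo_ret;							\
})

/* Counts how often the condition was reported true by the "once" form. */
static int hits_with_bug(int n)
{
	int i, hits = 0;

	for (i = 0; i < n; i++)
		if (DEMO_WARN_ON_ONCE(i >= 0))
			hits++;
	return hits;
}

/* Same loop with the passthrough form: no flag, no message. */
static int hits_without_bug(int n)
{
	int i, hits = 0;

	for (i = 0; i < n; i++)
		if (DEMO_WARN_ON_ONCE_NOBUG(i >= 0))
			hits++;
	return hits;
}
```

Both helpers return the same count, since callers still see the condition
value each time; only the first one ever prints, and it prints once.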



Re: [PATCH] net: phy: Add sysfs attribute to prevent PHY suspend

2014-03-09 Thread Sebastian Hesselbarth

On 03/10/2014 01:41 AM, David Miller wrote:

From: Sebastian Hesselbarth 
Date: Mon, 10 Mar 2014 01:37:32 +0100


The mechanism is manual, no automatic way to determine it.


We recognize BIOS and ACPI bugs and work around them, by looking at
version information and whatnot, so you really can't convince me that
something similar can't be done here perhaps in the platform code.


Hmm, if there is a way to determine the version of that particular u-boot,
I'd be happy to exploit that information. But I honestly doubt that.
Compared to u-boot bootloader and kernel interaction, BIOS and ACPI
are well-defined protocols.

Personally, I would prefer that everybody update their broken
bootloaders, but that will just not happen.

Anyway, at least for the two boards in question, we know a bootloader
workaround. The version does support user commands to re-enable the PHY
by writing the corresponding registers.

Unfortunately, there is a bug in phy_ethtool_get_wol that, up to now,
prevents most PHYs (without .wol callbacks) from being suspended.
I wanted to get in a way to disable suspend before sending a fix.

If you are that against a sysfs knob, I guess, we will just see how
many more bootloaders are broken and some will not have a way to write
PHY registers.

Sebastian


Re: [PATCH v7 net-next 1/3] filter: add Extended BPF interpreter and converter

2014-03-09 Thread Alexei Starovoitov
On Sun, Mar 9, 2014 at 3:00 PM, Daniel Borkmann  wrote:
> On 03/09/2014 06:08 PM, Alexei Starovoitov wrote:
>>
>> On Sun, Mar 9, 2014 at 5:29 AM, Daniel Borkmann 
>> wrote:
>>>
>>> On 03/09/2014 12:15 AM, Alexei Starovoitov wrote:


 Extended BPF extends old BPF in the following ways:
 - from 2 to 10 registers
 Original BPF has two registers (A and X) and hidden frame pointer.
 Extended BPF has ten registers and read-only frame pointer.
 - from 32-bit registers to 64-bit registers
 semantics of old 32-bit ALU operations are preserved via 32-bit
 subregisters
 - if (cond) jump_true; else jump_false;
 old BPF insns are replaced with:
 if (cond) jump_true; /* else fallthrough */
 - adds signed > and >= insns
 - 16 4-byte stack slots for register spill-fill replaced with
 up to 512 bytes of multi-use stack space
 - introduces bpf_call insn and register passing convention for zero
 overhead calls from/to other kernel functions (not part of this
 patch)
 - adds arithmetic right shift insn
 - adds swab32/swab64 insns
 - adds atomic_add insn
 - old tax/txa insns are replaced with 'mov dst,src' insn

 Extended BPF is designed to be JITed with one to one mapping, which
 allows GCC/LLVM backends to generate optimized BPF code that performs
 almost as fast as natively compiled code

 sk_convert_filter() remaps old style insns into extended:
 'sock_filter' instructions are remapped on the fly to
 'sock_filter_ext' extended instructions when
 sysctl net.core.bpf_ext_enable=1

 Old filter comes through sk_attach_filter() or
 sk_unattached_filter_create()
if (bpf_ext_enable) {
   convert to new
   sk_chk_filter() - check old bpf
   use sk_run_filter_ext() - new interpreter
} else {
   sk_chk_filter() - check old bpf
   if (bpf_jit_enable)
   use old jit
   else
   use sk_run_filter() - old interpreter
}

 sk_run_filter_ext() interpreter is noticeably faster
 than sk_run_filter() for two reasons:

 1.fall-through jumps
 Old BPF jump instructions are forced to go either 'true' or 'false'
 branch which causes branch-miss penalty.
 Extended BPF jump instructions have one branch and fall-through,
 which fit CPU branch predictor logic better.
 'perf stat' shows drastic difference for branch-misses.

 2.jump-threaded implementation of interpreter vs switch statement
 Instead of single tablejump at the top of 'switch' statement, GCC will
 generate multiple tablejump instructions, which helps CPU branch
 predictor

 Performance of two BPF filters generated by libpcap was measured
 on x86_64, i386 and arm32.

 fprog #1 is taken from Documentation/networking/filter.txt:
 tcpdump -i eth0 port 22 -dd

 fprog #2 is taken from 'man tcpdump':
 tcpdump -i eth0 'tcp port 22 and (((ip[2:2] - ((ip[0]&0xf)<<2)) -
  ((tcp[12]&0xf0)>>2)) != 0)' -dd

 Other libpcap programs have similar performance differences.

 Raw performance data from BPF micro-benchmark:
 SK_RUN_FILTER on same SKB (cache-hit) or 10k SKBs (cache-miss)
 time in nsec per call, smaller is better
 --x86_64--
               fprog #1   fprog #1    fprog #2   fprog #2
               cache-hit  cache-miss  cache-hit  cache-miss
 old BPF          90        101         192        202
 ext BPF          31         71          47         97
 old BPF jit      12         34          17         44
 ext BPF jit     TBD

 --i386--
               fprog #1   fprog #1    fprog #2   fprog #2
               cache-hit  cache-miss  cache-hit  cache-miss
 old BPF         107        136         227        252
 ext BPF          40        119          69        172

 --arm32--
               fprog #1   fprog #1    fprog #2   fprog #2
               cache-hit  cache-miss  cache-hit  cache-miss
 old BPF         202        300         475        540
 ext BPF         180        270         330        470
 old BPF jit      26        182          37        202
 new BPF jit     TBD

 Tested with trinity BPF fuzzer

 Future work:

 0. add bpf/ebpf testsuite to tools/testing/selftests/net/bpf

 1. add extended BPF JIT for x86_64

 2. add inband old/new demux and extended BPF verifier, so that new
  programs can be loaded through old sk_attach_filter() and
  sk_unattached_filter_create() interfaces

 3. tracing filters systemtap-like with extended BPF

 4. OVS with extended BPF

 5. nftables with extended BPF

 Signed-off-by: Alexei Starovoitov 
 Acked-by: Hagen Paul Pfeifer 
 Reviewed-by: Daniel Borkmann 
>>>
>>>
>>>
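
As a rough illustration of the two interpreter points quoted above
(one-target jumps with fall-through, and jump-threaded dispatch instead of
a single switch), here is a toy sketch. It is not the code from the patch:
the opcodes, struct toy_insn, toy_run() and toy_demo() are invented for
this example, there is no verifier, and the "labels as values" syntax is a
GNU C extension (GCC/Clang).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes; not the real extended BPF instruction set. */
enum { OP_ADD_IMM, OP_JGT_IMM, OP_EXIT, OP_COUNT };

struct toy_insn {
	uint8_t op;
	int16_t off;	/* jump offset, relative to the next instruction */
	int32_t imm;
};

/* Runs a toy program on one accumulator and returns its final value.
 * The program is trusted: opcodes and offsets are not checked. */
static int64_t toy_run(const struct toy_insn *insn, int64_t acc)
{
	/* One label per opcode (GNU C "labels as values"). */
	static const void *const jumptable[OP_COUNT] = {
		[OP_ADD_IMM] = &&do_add_imm,
		[OP_JGT_IMM] = &&do_jgt_imm,
		[OP_EXIT]    = &&do_exit,
	};

	/* Each handler ends in its own indirect jump, so there is one
	 * dispatch site per opcode rather than one shared switch. */
#define DISPATCH() goto *jumptable[insn->op]

	DISPATCH();

do_add_imm:
	acc += insn->imm;
	insn++;
	DISPATCH();

do_jgt_imm:
	/* Single target: a taken branch jumps, otherwise fall through. */
	if (acc > insn->imm)
		insn += insn->off;
	insn++;
	DISPATCH();

do_exit:
	return acc;
#undef DISPATCH
}

/* Small fixed program: skip the addition when acc is greater than 10. */
static int64_t toy_demo(int64_t acc)
{
	static const struct toy_insn prog[] = {
		{ OP_JGT_IMM, 1, 10 },	/* if (acc > 10) skip next insn */
		{ OP_ADD_IMM, 0, 100 },	/* acc += 100 */
		{ OP_EXIT,    0, 0 },
	};

	return toy_run(prog, acc);
}
```

With this program toy_demo(5) falls through the jump and returns 105,
while toy_demo(20) takes the jump and returns 20.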
>>> One more question or possible issue that came through my mind: When
>>> someone attaches a socket 

Re: [PATCH] net: phy: Add sysfs attribute to prevent PHY suspend

2014-03-09 Thread David Miller
From: Sebastian Hesselbarth 
Date: Mon, 10 Mar 2014 01:37:32 +0100

> The mechanism is manual, no automatic way to determine it.

We recognize BIOS and ACPI bugs and work around them, by looking at
version information and whatnot, so you really can't convince me that
something similar can't be done here perhaps in the platform code.




Re: [PATCH] net: phy: Add sysfs attribute to prevent PHY suspend

2014-03-09 Thread Sebastian Hesselbarth

On 03/10/2014 01:30 AM, David Miller wrote:

From: Sebastian Hesselbarth 
Date: Mon, 10 Mar 2014 00:25:24 +0100


There is no way to determine if a bootloader is broken or not. The
sysfs knob allows to provide a use case based decision. Of course,
we can invent some freaky device tree property but that the DT
maintainers will not like either.


My point is that whatever mechanism is used to "decide" that the sysfs
knob gets set, can also be used to "decide" that a DT property is
instantiated in the device tree.


The mechanism is manual, no automatic way to determine it. I understand
your point, but DT maintainers will argue here that DT is to describe HW
not SW. And a badly written bootloader initialization routine for a PHY
is SW.

Also, this will force us to maintain two sets of DT files for each
affected board: one for those with broken bootloader and one for those
with an updated, fixed bootloader. And of course, the broken bootloaders
are from pre-DT times and cannot even set that property but require the
user to pick the right DT.

If you are still against a sysfs knob, I see no user-accessible way to
prevent the PHY from being suspended. And the user is the only reliable
instance to decide not to suspend it.

Sebastian



Re: [PATCH] net: phy: Add sysfs attribute to prevent PHY suspend

2014-03-09 Thread David Miller
From: Sebastian Hesselbarth 
Date: Mon, 10 Mar 2014 00:25:24 +0100

> There is no way to determine if a bootloader is broken or not. The
> sysfs knob allows to provide a use case based decision. Of course,
> we can invent some freaky device tree property but that the DT
> maintainers will not like either.

My point is that whatever mechanism is used to "decide" that the sysfs
knob gets set, can also be used to "decide" that a DT property is
instantiated in the device tree.


Re: [PATCH 1/3] drivers: clk: add samsung common clock config option

2014-03-09 Thread Tomasz Figa

Hi Pankaj,

On 26.02.2014 06:24, Pankaj Dubey wrote:

add samsung common clock config option and let ARCH_EXYNOS or ARCH_S3C
select this if they want to use samsung common clock infrastructure.

CC: Mike Turquette 
Signed-off-by: Pankaj Dubey 
---
  drivers/clk/Kconfig  |   10 ++
  drivers/clk/Makefile |2 +-
  2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig
index 7641965..d93a325 100644
--- a/drivers/clk/Kconfig
+++ b/drivers/clk/Kconfig
@@ -23,6 +23,16 @@ config COMMON_CLK
  menu "Common Clock Framework"
depends on COMMON_CLK

+config COMMON_CLK_SAMSUNG
+   bool "Clock driver for Samsung SoCs"
+   depends on ARCH_S3C64XX || ARCH_S3C24XX || ARCH_EXYNOS || ARM64
+   ---help---
+  Supports clocking on Exynos SoCs:
+ - Exynos5250, Exynos5420 board.
+ - Exynos4 boards.
+ - S3C2412, S3C2416, S3C2466 boards.
+ - S3C64XX boards.


I don't think listing the platforms here explicitly is a good idea, as 
this option shouldn't generally be user-visible (related platforms would 
not work without this option enabled) and adding support for every new 
SoC would require changing the help string.


I wonder if we really need this to be user-visible. What about moving it 
out of this menu, making the symbol select COMMON_CLK and let the 
platforms just select COMMON_CLK_SAMSUNG alone?


Best regards,
Tomasz


Re: [PATCH 5/9 v7] clk: samsung exynos5250/5420: Add gate clock for SSS module

2014-03-09 Thread Tomasz Figa

On 17.02.2014 10:44, Naveen Krishna Chatradhi wrote:

This patch adds gating clock for SSS(Security SubSystem)
module on Exynos5250/5420.

Signed-off-by: Naveen Krishna Chatradhi 
Reviewed-by: Tomasz Figa 
TO: 
TO: Tomasz Figa 
CC: David S. Miller 
CC: Kukjin Kim 
CC: 
---
changes since v6:
None
changes since v5:
1. Added Reviewed-by: Tomasz Figa 

  .../devicetree/bindings/clock/exynos5250-clock.txt |1 +
  drivers/clk/samsung/clk-exynos5250.c   |1 +
  drivers/clk/samsung/clk-exynos5420.c   |4 
  include/dt-bindings/clock/exynos5250.h |1 +
  4 files changed, 7 insertions(+)


Applied to samsung-clk tree.

Best regards,
Tomasz


[GIT PULL] x86 fixes for v3.14-rc6 - bad patch dropped

2014-03-09 Thread H. Peter Anvin
Hi Linus,

The same collection of fixes except the broken NMI patch dropped.  I
will send a fixed version of that plus Suresh' FPU fix in a few days,
to get them some testing, plus I will be on a trip (part of why I got
unduly rushed this past Friday.  Sorry again for that.)

The following changes since commit 0414855fdc4a40da05221fc6062cccbc0c30f169:

  Linux 3.14-rc5 (2014-03-02 18:56:16 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
x86-urgent-for-linus-fixed

for you to fetch changes up to d4078e232267ff53f3b030b9698a3c001db4dbec:

  x86, trace: Further robustify CR2 handling vs tracing (2014-03-06 10:58:18 
-0800)


Borislav Petkov (1):
  x86/efi: Quirk out SGI UV

H. Peter Anvin (1):
  Merge tag 'efi-urgent' into x86/urgent

Jiri Olsa (1):
  x86, trace: Fix CR2 corruption when tracing page faults

Peter Zijlstra (1):
  x86, trace: Further robustify CR2 handling vs tracing

 arch/x86/include/asm/efi.h  |  1 +
 arch/x86/kernel/setup.c | 10 ++
 arch/x86/mm/fault.c | 47 +++--
 arch/x86/platform/efi/efi.c | 20 +++
 4 files changed, 56 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 3d6b9f81cc68..acd86c850414 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -134,6 +134,7 @@ extern void efi_setup_page_tables(void);
 extern void __init old_map_region(efi_memory_desc_t *md);
 extern void __init runtime_code_page_mkexec(void);
 extern void __init efi_runtime_mkexec(void);
+extern void __init efi_apply_memmap_quirks(void);
 
 struct efi_setup_data {
u64 fw_vendor;
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 06853e670354..ce72964b2f46 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1239,14 +1239,8 @@ void __init setup_arch(char **cmdline_p)
register_refined_jiffies(CLOCK_TICK_RATE);
 
 #ifdef CONFIG_EFI
-   /* Once setup is done above, unmap the EFI memory map on
-* mismatched firmware/kernel archtectures since there is no
-* support for runtime services.
-*/
-   if (efi_enabled(EFI_BOOT) && !efi_is_native()) {
-   pr_info("efi: Setup done, disabling due to 32/64-bit 
mismatch\n");
-   efi_unmap_memmap();
-   }
+   if (efi_enabled(EFI_BOOT))
+   efi_apply_memmap_quirks();
 #endif
 }
 
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 6dea040cc3a1..a10c8c792161 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1020,13 +1020,17 @@ static inline bool smap_violation(int error_code, 
struct pt_regs *regs)
  * This routine handles page faults.  It determines the address,
  * and the problem, and then passes it off to one of the appropriate
  * routines.
+ *
+ * This function must have noinline because both callers
+ * {,trace_}do_page_fault() have notrace on. Having this an actual function
+ * guarantees there's a function trace entry.
  */
-static void __kprobes
-__do_page_fault(struct pt_regs *regs, unsigned long error_code)
+static void __kprobes noinline
+__do_page_fault(struct pt_regs *regs, unsigned long error_code,
+   unsigned long address)
 {
struct vm_area_struct *vma;
struct task_struct *tsk;
-   unsigned long address;
struct mm_struct *mm;
int fault;
unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
@@ -1034,9 +1038,6 @@ __do_page_fault(struct pt_regs *regs, unsigned long 
error_code)
tsk = current;
mm = tsk->mm;
 
-   /* Get the faulting address: */
-   address = read_cr2();
-
/*
 * Detect and handle instructions that would cause a page fault for
 * both a tracked kernel page and a userspace page.
@@ -1248,32 +1249,50 @@ good_area:
up_read(&mm->mmap_sem);
 }
 
-dotraplinkage void __kprobes
+dotraplinkage void __kprobes notrace
 do_page_fault(struct pt_regs *regs, unsigned long error_code)
 {
+   unsigned long address = read_cr2(); /* Get the faulting address */
enum ctx_state prev_state;
 
+   /*
+* We must have this function tagged with __kprobes, notrace and call
+* read_cr2() before calling anything else. To avoid calling any kind
+* of tracing machinery before we've observed the CR2 value.
+*
+* exception_{enter,exit}() contain all sorts of tracepoints.
+*/
+
prev_state = exception_enter();
-   __do_page_fault(regs, error_code);
+   __do_page_fault(regs, error_code, address);
exception_exit(prev_state);
 }
 
-static void trace_page_fault_entries(struct pt_regs *regs,
+#ifdef CONFIG_TRACING
+static void trace_page_fault_entries(unsigned long address, struct pt_regs 
*regs,
 unsigned long 

Re: [x86, vdso] BUG: unable to handle kernel paging request at d34bd000

2014-03-09 Thread H. Peter Anvin
On 03/09/2014 12:47 AM, Stefani Seibold wrote:
> 
> But let me ask another question: Is the compat mode still needed
> anymore?
> 
> Since Lguest, XEN, OLPC and the reservetop kernel parameter will change
> the __FIXADDR_TOP, there is no fixed place for the VDSO page. Also in the
> 32 bit emulation layer the address is not fixed.
> 
> So all applications can fail when they try to directly access the VDSO
> page with a hard coded address 0xe000.
> 
> IMHO this is broken. So another solution is to remove the whole VDSO
> compat code.
> 

Lguest, Xen, OLPC and reservetop are corner cases.  My understanding is
that at least one widely used distro actually cared about this, and
Linus especially is adamant that "we don't break userspace."

The dual vdso approach might be the best bet, for the cases where
compatibility is even possible.

-hpa




Re: [PATCH] net: sched: dev_deactivate_many(): use msleep(1) instead of yield() to wait for outstanding qdisc_run calls

2014-03-09 Thread Stanislav Meduna
On 09.03.2014 23:53, David Miller wrote:

> To me it means "I've got nothing to do if other tasks want to run right
> now"  Yes, I even see it having this meaning when an RT task executes
> it.

http://www.kernel.org/doc/htmldocs/device-drivers/API-yield.html
lists this exact "while (!event) yield;" pattern as a broken usage
and states "Never use yield as a progress guarantee!!".

> How else can you interpret the intent above?

IMNSHO there is no way to make the yield() honor this intent if it is
called from a SCHED_FIFO or SCHED_RR task and the task unblocking
it is running at a lower priority. On the PREEMPT_RT systems where
the interrupt handlers are threads an application thread running
at higher priority than some interrupt handlers is a common situation.

The semantics of the scheduler here is "run the highest priority
runnable task until it blocks (FIFO) or its time slice is over
and there are more with the _same_ priority (RR)". This scenario
then collapses into a busy loop never making progress.

> If you change it to msleep(1), you're assigning an extra completely
> arbitrary time limit to the yield.  The code doesn't want to sleep
> for 1ms, that's not what it's asking for.

The problem is that there is no way to formulate what it is asking
for in scheduler terms only.

Regards
-- 
 Stano
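
To make the argument above concrete, here is a toy model of a strict
priority scheduler, a sketch only and not the kernel's scheduler. It
assumes one high-priority waiter polling for completion and one
low-priority worker doing the work; enum wait_style and
ticks_until_done() are names invented for this example. Each tick runs
the highest-priority runnable task. A yield leaves the waiter runnable,
and with no peer of the same priority it is picked again immediately; a
sleep blocks it for one tick, which lets the worker run.

```c
#include <stdbool.h>

enum wait_style { WAIT_YIELD, WAIT_SLEEP };

/*
 * Returns the tick at which the waiter observes that all work is done,
 * or -1 if that has not happened within max_ticks.
 */
static int ticks_until_done(enum wait_style style, int work_units,
			    int max_ticks)
{
	int sleep_left = 0;	/* ticks the waiter stays blocked */
	int tick;

	for (tick = 0; tick < max_ticks; tick++) {
		bool waiter_runnable = (sleep_left == 0);

		if (waiter_runnable) {
			/* The high-priority waiter always wins the CPU. */
			if (work_units == 0)
				return tick;
			if (style == WAIT_SLEEP)
				sleep_left = 1;	/* block for one tick */
			/* WAIT_YIELD: still runnable, still highest. */
		} else {
			/* Waiter is blocked, so the worker gets to run. */
			work_units--;
			sleep_left--;
		}
	}
	return -1;	/* no completion observed */
}
```

In this model ticks_until_done(WAIT_YIELD, 3, 1000) never completes and
returns -1, while ticks_until_done(WAIT_SLEEP, 3, 1000) returns 6.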



Re: [qemu64,+smep,+smap] WARNING: CPU: 1 PID: 0 at arch/x86/kernel/cpu/amd.c:220 init_amd()

2014-03-09 Thread H. Peter Anvin
On 03/09/2014 11:07 AM, Paolo Bonzini wrote:
> 
> We really should give a loud warning if qemu64 is used with KVM.  It
> makes no sense with KVM, even less than it does with dynamic translation.
> 

Well, this is dynamic translation.

-hpa




Re: [x86, vdso] BUG: unable to handle kernel paging request at d34bd000

2014-03-09 Thread H. Peter Anvin
On 03/09/2014 12:08 AM, Stefani Seibold wrote:
> 
> This was not addressed to you, it was addressed to the x86 intel kernel
> developers to do more testing, since this piece of code has so many side
> effects. I apologize for this misunderstanding.
> 

I think you're misunderstanding.

We cannot debug every single contributors' code for them.  There isn't
enough of us to go around.  We have in fact stretched well beyond the
point which we usually can accommodate for this particular patchset.

-hpa




[PATCH 2/2] staging: cxt1e1: remove redundant memset() call

2014-03-09 Thread Daeseok Youn

The name array doesn't need to be set to 0, because
sprintf/snprintf adds a terminating '\0'.

Also, the address of the name array doesn't need to be
assigned to the np pointer.

Signed-off-by: Daeseok Youn 
---
 drivers/staging/cxt1e1/linux.c |8 +++-
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/cxt1e1/linux.c b/drivers/staging/cxt1e1/linux.c
index 5d7ddd4..efd3eb8 100644
--- a/drivers/staging/cxt1e1/linux.c
+++ b/drivers/staging/cxt1e1/linux.c
@@ -204,15 +204,13 @@ status_t
 c4_wq_port_init(mpi_t *pi)
 {
 
-   charname[16], *np;  /* NOTE: name of the queue limited by system
+   charname[16];  /* NOTE: name of the queue limited by system
 * to 10 characters */
-
if (pi->wq_port)
return 0;   /* already initialized */
 
-   np = name;
-   memset(name, 0, 16);
-   sprintf(np, "%s%d", pi->up->devname, pi->portnum); /* IE pmcc4-01) */
+   /* IE pmcc4-01 */
+   snprintf(name, sizeof(name), "%s%d", pi->up->devname, pi->portnum);
 
 #ifdef RLD_RESTART_DEBUG
pr_info(">> %s: creating workqueue <%s> for Port %d.\n",
-- 
1.7.4.4
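
A small userspace sketch of the reasoning in the changelog, assuming
standard C snprintf() behaviour: with a non-zero size the output is always
NUL-terminated, so pre-clearing the buffer adds nothing. The helper name
format_port_name() and the device names used with it are made up for the
example; the buffer is deliberately filled with junk first to show that
the terminator comes from snprintf() and not from an earlier memset to 0.

```c
#include <stdio.h>
#include <string.h>

/* Formats "<devname><portnum>" into name[size] and returns the stored
 * length. The buffer starts out full of non-zero junk on purpose. */
static size_t format_port_name(char *name, size_t size,
			       const char *devname, int portnum)
{
	memset(name, 0x7f, size);	/* junk, not zeroes */
	snprintf(name, size, "%s%d", devname, portnum);
	return strlen(name);		/* safe: snprintf terminated it */
}
```

With a 16-byte buffer, a short name such as "pmcc4-" plus port 1 is stored
whole, and an over-long device name is truncated to 15 characters plus the
terminator instead of overflowing the array.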




[PATCH 1/2] staging: cxtie1: remove unneeded mkret() calls

2014-03-09 Thread Daeseok Youn

mkret() changes an error value from positive to negative.
With this patch, a negative value is returned directly on
failure, so there is no need to call a separate function to
change the sign.

Signed-off-by: Daeseok Youn 
---
 drivers/staging/cxt1e1/linux.c |   72 +--
 drivers/staging/cxt1e1/musycc.c|2 +-
 drivers/staging/cxt1e1/pmcc4_drv.c |   40 ++--
 3 files changed, 48 insertions(+), 66 deletions(-)

diff --git a/drivers/staging/cxt1e1/linux.c b/drivers/staging/cxt1e1/linux.c
index 549efd1..5d7ddd4 100644
--- a/drivers/staging/cxt1e1/linux.c
+++ b/drivers/staging/cxt1e1/linux.c
@@ -145,16 +145,6 @@ get_hdlc_name(hdlc_device *hdlc)
return dev->name;
 }
 
-
-static status_t
-mkret(int bsd)
-{
-   if (bsd > 0)
-   return -bsd;
-   else
-   return bsd;
-}
-
 /***/
 #include 
 
@@ -292,8 +282,8 @@ chan_open(struct net_device *ndev)
}
 
ret = c4_chan_up(priv->ci, priv->channum);
-   if (ret)
-   return -ret;
+   if (ret < 0)
+   return ret;
try_module_get(THIS_MODULE);
netif_start_queue(ndev);
return 0;   /* no error = success */
@@ -523,8 +513,8 @@ do_get_port(struct net_device *ndev, void *data)
if (!ci)
return -EINVAL; /* get card info */
 
-   ret = mkret(c4_get_port(ci, pp.portnum));
-   if (ret)
+   ret = c4_get_port(ci, pp.portnum);
+   if (ret < 0)
return ret;
if (copy_to_user(data, >port[pp.portnum].p,
 sizeof(struct sbecom_port_param)))
@@ -551,7 +541,7 @@ do_set_port(struct net_device *ndev, void *data)
return -ENXIO;
 
memcpy(>port[pp.portnum].p, , sizeof(struct sbecom_port_param));
-   return mkret(c4_set_port(ci, pp.portnum));
+   return c4_set_port(ci, pp.portnum);
 }
 
 /* work the port loopback mode as per directed */
@@ -566,7 +556,7 @@ do_port_loop(struct net_device *ndev, void *data)
ci = get_ci_by_dev(ndev);
if (!ci)
return -EINVAL;
-   return mkret(c4_loop_port(ci, pp.portnum, pp.port_mode));
+   return c4_loop_port(ci, pp.portnum, pp.port_mode);
 }
 
 /* set the specified register with the given value / or just read it */
@@ -582,8 +572,8 @@ do_framer_rw(struct net_device *ndev, void *data)
ci = get_ci_by_dev(ndev);
if (!ci)
return -EINVAL;
-   ret = mkret(c4_frame_rw(ci, ));
-   if (ret)
+   ret = c4_frame_rw(ci, );
+   if (ret < 0)
return ret;
if (copy_to_user(data, , sizeof(struct sbecom_port_param)))
return -EFAULT;
@@ -603,7 +593,8 @@ do_pld_rw(struct net_device *ndev, void *data)
ci = get_ci_by_dev(ndev);
if (!ci)
return -EINVAL;
-   ret = mkret(c4_pld_rw(ci, ));
+
+   ret = c4_pld_rw(ci, );
if (ret)
return ret;
if (copy_to_user(data, , sizeof(struct sbecom_port_param)))
@@ -624,8 +615,8 @@ do_musycc_rw(struct net_device *ndev, void *data)
ci = get_ci_by_dev(ndev);
if (!ci)
return -EINVAL;
-   ret = mkret(c4_musycc_rw(ci, ));
-   if (ret)
+   ret = c4_musycc_rw(ci, );
+   if (ret < 0)
return ret;
if (copy_to_user(data, , sizeof(struct c4_musycc_param)))
return -EFAULT;
@@ -642,8 +633,8 @@ do_get_chan(struct net_device *ndev, void *data)
sizeof(struct sbecom_chan_param)))
return -EFAULT;
 
-   ret = mkret(c4_get_chan(cp.channum, ));
-   if (ret)
+   ret = c4_get_chan(cp.channum, );
+   if (ret < 0)
return ret;
 
if (copy_to_user(data, , sizeof(struct sbecom_chan_param)))
@@ -655,7 +646,6 @@ static status_t
 do_set_chan(struct net_device *ndev, void *data)
 {
struct sbecom_chan_param cp;
-   int ret;
ci_t   *ci;
 
if (copy_from_user(, data, sizeof(struct sbecom_chan_param)))
@@ -663,13 +653,7 @@ do_set_chan(struct net_device *ndev, void *data)
ci = get_ci_by_dev(ndev);
if (!ci)
return -EINVAL;
-   switch (ret = mkret(c4_set_chan(cp.channum, )))
-   {
-   case 0:
-   return 0;
-   default:
-   return ret;
-   }
+   return c4_set_chan(cp.channum, );
 }
 
 static status_t
@@ -688,8 +672,8 @@ do_create_chan(struct net_device *ndev, void *data)
dev = create_chan(ndev, ci, );
if (!dev)
return -EBUSY;
-   ret = mkret(c4_new_chan(ci, cp.port, cp.channum, dev));
-   if (ret) {
+   ret = c4_new_chan(ci, cp.port, cp.channum, dev);
+   if (ret < 0) {
/* needed due to Ioctl calling sequence */
rtnl_unlock();

Generic HSI client DT bindings

2014-03-09 Thread Sebastian Reichel
Hi,

I'm currently working on Device Tree support for the HSI subsystem
to get the Nokia N900 modem working in the mainline kernel. I guess
the key question for the binding has been asked by Mark:

Mark Rutland  wrote [0]:
> Does HSI have an addressing scheme, or does each port
> have a single device?

It's easy to answer the question in an abstract way: A HSI link
connects two processors with each other providing multiple logical
channels. The problem is, that only the physical layer is specified
AFAIK (I do not have access to HSI specification, so I do not know
for sure). Thus the exact usage of the channels is vendor specific.

For the Nokia N900 modem, one of the following bindings
seems sensible to me:

= Variant A =
hsi-port {
/* some nodes describing the port */

n900_modem: client-device {
compatible = "nokia,n900-modem";
reg = <0>, <1>, <2>, <3>;
reg-names = "mcsaab-control",
"speech-control",
"mcsaab-data",
"speech-data";
hsi-mode = "stream";
hsi-speed-kbps = <55000>;
hsi-flow = "synchronized";
hsi-arb-mode = "round-robin";
};
};
#

= Variant B =
hsi-port {
	/* some nodes describing the port */

	cmt_mcsaab: client-device@0 {
		compatible = "nokia,mcsaab-protocol";
		reg = <0>, <2>;
		reg-names = "mcsaab-control",
			    "mcsaab-data";
		hsi-mode = "stream";
		hsi-channels = <4>;
		hsi-speed-kbps = <55000>;
		hsi-flow = "synchronized";
		hsi-arb-mode = "round-robin";
	};

	cmt_speech: client-device@1 {
		compatible = "nokia,cmt-speech";
		reg = <1>, <3>;
		reg-names = "speech-control",
			    "speech-data";
		hsi-mode = "stream";
		hsi-channels = <4>;
		hsi-speed-kbps = <55000>;
		hsi-flow = "synchronized";
		hsi-arb-mode = "round-robin";
	};
};
#

Both bindings are simplified and do not map all of the hardware's
capabilities: some settings can be configured differently for RX and
TX. It should be easy to extend the support with e.g. hsi-mode-rx
and hsi-mode-tx once this is needed. For devices configuring them
the same, having just one node seems better to me (I assume almost
all devices want them to be the same).

As far as I can see, only ST-Ericsson also has HSI clients prepared
for the mainline kernel. It would be nice to get some feedback from
you. If you know of more HSI users who are not yet Cc'd, please feel
free to point them here.

[0] http://article.gmane.org/gmane.linux.kernel/1654456

-- Sebastian
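
To make the pairing of reg and reg-names in Variant A concrete, here is
a minimal sketch in plain C of how a client could look up a channel
number by name. The struct, the table and hsi_channel_id_by_name() are
hypothetical names invented for this illustration; they are not part of
the HSI framework, and a real driver would read the values from the
device tree rather than from a static table.

```c
#include <stddef.h>
#include <string.h>

/* One (reg-names, reg) pair taken from a client node. Hypothetical type. */
struct hsi_channel_map {
	const char *name;	/* entry of "reg-names" */
	unsigned int id;	/* matching entry of "reg" */
};

/* Variant A for the N900 modem, copied from the proposal above. */
static const struct hsi_channel_map n900_modem_channels[] = {
	{ "mcsaab-control", 0 },
	{ "speech-control", 1 },
	{ "mcsaab-data",    2 },
	{ "speech-data",    3 },
};

/*
 * Hypothetical helper: return the channel id registered under @name,
 * or -1 if the table has no such entry.
 */
static int hsi_channel_id_by_name(const struct hsi_channel_map *map,
				  size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(map[i].name, name) == 0)
			return (int)map[i].id;
	return -1;
}
```

With Variant B the same lookup would simply run on two smaller tables,
one per client node.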


Re: [PATCH] net: sched: dev_deactivate_many(): use msleep(1) instead of yield() to wait for outstanding qdisc_run calls

2014-03-09 Thread David Lang

On Sun, 9 Mar 2014, Ben Hutchings wrote:

> On Sun, 2014-03-09 at 18:53 -0400, David Miller wrote:
>> From: Ben Hutchings
>> Date: Sun, 09 Mar 2014 19:09:20 +
>>
>>> On Thu, 2014-03-06 at 16:06 -0500, David Miller wrote:
>>>> From: Marc Kleine-Budde
>>>> Date: Wed,  5 Mar 2014 00:49:47 +0100
>>>>
>>>>> @@ -839,7 +839,7 @@ void dev_deactivate_many(struct list_head *head)
>>>>>   /* Wait for outstanding qdisc_run calls. */
>>>>>   list_for_each_entry(dev, head, unreg_list)
>>>>>   while (some_qdisc_is_busy(dev))
>>>>> - yield();
>>>>> + msleep(1)
>>>>>  }
>>>>
>>>> I don't understand this.
>>>>
>>>> yield() should really _mean_ yield.
>>>>
>>>> The intent of a yield() call, like this one here, is unambiguously
>>>> that the current thread cannot do anything until some other thread
>>>> gets onto the cpu and makes forward progress.
>>>>
>>>> Therefore it should allow lower priority threads to run, not just
>>>> equal or higher priority ones.
>>>
>>> Until when?
>>>
>>> yield() is not a sensible operation in a preemptive multitasking system,
>>> regardless of RT.
>>
>> To me it means "I've got nothing to do if other tasks want to run right
>> now"  Yes, I even see it having this meaning when an RT task executes
>> it.
>>
>> How else can you interpret the intent above?
>
> The problem is that 'I've got nothing to do ... now' is information
> about a *point* in time, which gives the scheduler no clue as to when
> this task might be ready again.  So far as task *state* goes, it never
> ceases to be ready.
>
>> If you change it to msleep(1), you're assigning an extra completely
>> arbitrary time limit to the yield.  The code doesn't want to sleep
>> for 1ms, that's not what it's asking for.
>
> [...]
>
> I think you want to give up a 'time slice' to any task that's available.
> But that's not a meaningful concept for all schedulers.  If your task is
> highest priority and is ready, it must run.  You could drop priority
> temporarily, but then you need it to somehow be bumped up again at some
> time in the future.  Well, sleeping effectively does that.
>
> I do understand that unconditionally sleeping also isn't ideal - the
> task that unblocks this one might already be running on another CPU and
> able to finish sooner than the sleep timeout.
>
> Maybe the answer is a yield_for(time) which sleeps for the given time or
> until there's an idle CPU, whichever is sooner.  But I don't know how
> hard that would be to implement or how widely useful it would be.


What is msleep(0) defined to do? Would it be any better?

David Lang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] net: phy: Add sysfs attribute to prevent PHY suspend

2014-03-09 Thread Sebastian Hesselbarth

On 03/10/2014 12:12 AM, David Miller wrote:
> From: Sebastian Hesselbarth
> Date: Fri,  7 Mar 2014 12:34:52 +0100
>
>> commit 1211ce53077164e0d34641d0ca5fb4d4a7574498
>>    ("net: phy: resume/suspend PHYs on attach/detach")
>> introduced a feature to suspend PHYs when entering halted state.
>>
>> Unfortunately, not all bootloaders properly power-up PHYs on reset
>> and fail to access ethernet because the PHY is still powered down.
>>
>> Therefore, this adds code and documentation for a sysfs attribute to
>> allow/deny PHYs to be suspended on a per-PHY basis. Disabling that
>> attribute prevents PHYs from being suspended when entering halted state.
>>
>> Signed-off-by: Sebastian Hesselbarth
>> Reported-by: Andrew Lunn
>
> I know you won't like what I have to say, but I want to see a solution
> without this sysfs knob.
>
> First of all, you obviously have a way to end up having the sysfs knob
> get set on the appropriate systems.
>
> Therefore, you obviously have some way to propagate the same piece of
> information into the kernel somehow during boot time.
>
> For example, via a device tree property or similar.


There is no way to determine whether a bootloader is broken or not. The
sysfs knob allows a decision based on the use case. Of course, we
can invent some freaky device tree property, but the DT maintainers
will not like that either.

This is not an issue with broken HW but with broken SW. The PHY is fine;
you can suspend and resume it perfectly in Linux. But the bootloader
fails to properly initialize it on a warm boot. You could update the
bootloader and the issue disappears.


> Please pursue an avenue such as that.  This sysfs thing, it's a user
> facing interface we'd have to live with forever.


If you want me to try a device tree property for the issue, we also
create ABI that we will have to live with forever.

I am open to suggestions, but I have a bad feeling about a "broken
bootloader" DT property.

Sebastian


Re: [PATCH] net: sched: dev_deactivate_many(): use msleep(1) instead of yield() to wait for outstanding qdisc_run calls

2014-03-09 Thread Ben Hutchings
On Sun, 2014-03-09 at 18:53 -0400, David Miller wrote:
> From: Ben Hutchings 
> Date: Sun, 09 Mar 2014 19:09:20 +
> 
> > On Thu, 2014-03-06 at 16:06 -0500, David Miller wrote:
> >> From: Marc Kleine-Budde 
> >> Date: Wed,  5 Mar 2014 00:49:47 +0100
> >> 
> >> > @@ -839,7 +839,7 @@ void dev_deactivate_many(struct list_head *head)
> >> >   /* Wait for outstanding qdisc_run calls. */
> >> >   list_for_each_entry(dev, head, unreg_list)
> >> >   while (some_qdisc_is_busy(dev))
> >> > - yield();
> >> > + msleep(1)
> >> >  }
> >> 
> >> I don't understand this.
> >> 
> >> yield() should really _mean_ yield.
> >> 
> >> The intent of a yield() call, like this one here, is unambiguously
> >> that the current thread cannot do anything until some other thread
> >> gets onto the cpu and makes forward progress.
> >>
> >> Therefore it should allow lower priority threads to run, not just
> >> equal or higher priority ones.
> > 
> > Until when?
> > 
> > yield() is not a sensible operation in a preemptive multitasking system,
> > regardless of RT.
> 
> To me it means "I've got nothing to do if other tasks want to run right
> now"  Yes, I even see it having this meaning when an RT task executes
> it.
> 
> How else can you interpret the intent above?

The problem is that 'I've got nothing to do ... now' is information
about a *point* in time, which gives the scheduler no clue as to when
this task might be ready again.  So far as task *state* goes, it never
ceases to be ready.

> If you change it to msleep(1), you're assigning an extra completely
> arbitrary time limit to the yield.  The code doesn't want to sleep
> for 1ms, that's not what it's asking for.
[...]

I think you want to give up a 'time slice' to any task that's available.
But that's not a meaningful concept for all schedulers.  If your task is
highest priority and is ready, it must run.  You could drop priority
temporarily, but then you need it to somehow be bumped up again at some
time in the future.  Well, sleeping effectively does that.

I do understand that unconditionally sleeping also isn't ideal - the
task that unblocks this one might already be running on another CPU and
able to finish sooner than the sleep timeout.

Maybe the answer is a yield_for(time) which sleeps for the given time or
until there's an idle CPU, whichever is sooner.  But I don't know how
hard that would be to implement or how widely useful it would be.

Ben.

-- 
Ben Hutchings
I say we take off; nuke the site from orbit.  It's the only way to be sure.
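
The trade-off discussed in this thread can be tried out in userspace.
The following is only a sketch under stated assumptions: it uses the
POSIX calls sched_yield() and nanosleep() as stand-ins for the kernel's
yield() and msleep(), and countdown_busy() is a made-up replacement for
some_qdisc_is_busy(). It is not the kernel code and says nothing about
how the kernel scheduler treats RT tasks.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <time.h>

/* Predicate polled by the waiters; returns true while work is pending. */
typedef bool (*busy_fn)(void *ctx);

/* Poll @busy, yielding the CPU between checks. Returns the yield count. */
static unsigned long wait_with_yield(busy_fn busy, void *ctx)
{
	unsigned long polls = 0;

	assert(busy != NULL);
	while (busy(ctx)) {
		sched_yield();	/* task stays runnable the whole time */
		polls++;
	}
	return polls;
}

/* Poll @busy, sleeping @sleep_ms (0..999) between checks. */
static unsigned long wait_with_sleep(busy_fn busy, void *ctx, long sleep_ms)
{
	struct timespec ts;
	unsigned long polls = 0;

	assert(busy != NULL);
	assert(sleep_ms >= 0 && sleep_ms < 1000);
	ts.tv_sec = 0;
	ts.tv_nsec = sleep_ms * 1000000L;
	while (busy(ctx)) {
		/* An early return (e.g. on a signal) just causes a re-poll. */
		nanosleep(&ts, NULL);
		polls++;
	}
	return polls;
}

/* Hypothetical stand-in for some_qdisc_is_busy(): busy for N polls. */
struct countdown {
	int remaining;
};

static bool countdown_busy(void *ctx)
{
	struct countdown *c = ctx;

	if (c->remaining > 0) {
		c->remaining--;
		return true;
	}
	return false;
}
```

The sleeping variant gives the scheduler a definite time at which the
task becomes ready again, at the cost of the arbitrary delay that was
objected to above.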


Re: [PATCH] net: phy: Add sysfs attribute to prevent PHY suspend

2014-03-09 Thread David Miller
From: Sebastian Hesselbarth 
Date: Fri,  7 Mar 2014 12:34:52 +0100

> commit 1211ce53077164e0d34641d0ca5fb4d4a7574498
>   ("net: phy: resume/suspend PHYs on attach/detach")
> introduced a feature to suspend PHYs when entering halted state.
> 
> Unfortunately, not all bootloaders properly power-up PHYs on reset
> and fail to access ethernet because the PHY is still powered down.
> 
> Therefore, this adds code and documentation for a sysfs attribute to
> allow/deny PHYs to be suspended on a per-PHY basis. Disabling that
> attribute prevents PHYs from being suspended when entering halted state.
> 
> Signed-off-by: Sebastian Hesselbarth 
> Reported-by: Andrew Lunn 

I know you won't like what I have to say, but I want to see a solution
without this sysfs knob.

First of all, you obviously have a way to end up having the sysfs knob
get set on the appropriate systems.

Therefore, you obviously have some way to propagate the same piece of
information into the kernel somehow during boot time.

For example, via a device tree property or similar.

Please pursue an avenue such as that.  This sysfs thing, it's a user
facing interface we'd have to live with forever.


Re: [PATCH] netlink: switch net_ns only if net is not init_net

2014-03-09 Thread David Miller
From: Gu Zheng 
Date: Fri, 07 Mar 2014 18:47:30 +0800

> Many netlink users create their netlink sock in init_net, and
> switching net_ns (init_net --> net) is needless in this case. So here
> we add a pre-check to avoid this.
> 
> Signed-off-by: Gu Zheng 

This check is more appropriately placed into sk_change_net() itself.


Re: [PATCH net-next v2 10/13] r8152: support IPv6

2014-03-09 Thread David Miller
From: Ben Hutchings 
Date: Sun, 09 Mar 2014 19:47:55 +

> On Wed, 2014-03-05 at 14:49 +0800, Hayes Wang wrote:
>> Support hw IPv6 checksum for TCP and UDP packets.
>> 
>> Note that the hw has the limitation of the range of the transport
>> offset. Besides, the TCP Pseudo Header of the IPv6 TSO of the hw
>> bases on the Microsoft document which excludes the packet length.
>> 
>> Signed-off-by: Hayes Wang 
>> ---
>>  drivers/net/usb/r8152.c | 107 ++--
>>  1 file changed, 104 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
>> index 8f6d0f8..8e208f30 100644
>> --- a/drivers/net/usb/r8152.c
>> +++ b/drivers/net/usb/r8152.c
> [...]
>> +static int msdn_giant_send_check(struct sk_buff *skb)
>> +{
>> +const struct ipv6hdr *ipv6h;
>> +struct tcphdr *th;
>> +
>> +ipv6h = ipv6_hdr(skb);
>> +th = tcp_hdr(skb);
>> +
>> +th->check = 0;
>> +th->check = ~tcp_v6_check(0, &ipv6h->saddr, &ipv6h->daddr, 0);
> [...]
> 
> I think you need to call skb_cow_head() before editing the header here.

This made me notice that several drivers open-code this:

	if (skb_header_cloned(skb) &&
	    pskb_expand_head(skb, 0, 0, GFP_ATOMIC))
		goto drop;

If someone is looking for a quick cleanup, transforming these
to use skb_cow_head() would be nice.  That way other driver
authors will be less likely to copy the expanded code.
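
The reason a header must be made private before it is edited can be
shown with a small userspace model. This is only a sketch of the
copy-on-write idea, not the kernel implementation: struct pkt and the
pkt_*() helpers are hypothetical names, and real sk_buffs track header
sharing differently.

```c
#include <stdlib.h>
#include <string.h>

/* Toy packet: header bytes that may be shared between clones. */
struct pkt {
	unsigned char *head;	/* header bytes */
	size_t len;		/* header length, must be > 0 */
	int *users;		/* how many pkts share @head */
};

static int pkt_init(struct pkt *p, const void *src, size_t len)
{
	p->head = malloc(len);
	p->users = malloc(sizeof(*p->users));
	if (!p->head || !p->users) {
		free(p->head);
		free(p->users);
		return -1;
	}
	memcpy(p->head, src, len);
	p->len = len;
	*p->users = 1;
	return 0;
}

/* Clone shares the header instead of copying it. */
static void pkt_clone(struct pkt *dst, const struct pkt *src)
{
	*dst = *src;
	(*dst->users)++;
}

/* Make @p's header private so that it is safe to write to. */
static int pkt_cow_head(struct pkt *p)
{
	unsigned char *copy;
	int *users;

	if (*p->users == 1)
		return 0;	/* sole owner, nothing to do */

	copy = malloc(p->len);
	users = malloc(sizeof(*users));
	if (!copy || !users) {
		free(copy);
		free(users);
		return -1;
	}
	memcpy(copy, p->head, p->len);
	(*p->users)--;		/* drop our share of the old header */
	*users = 1;
	p->head = copy;
	p->users = users;
	return 0;
}

static void pkt_put(struct pkt *p)
{
	if (--(*p->users) == 0) {
		free(p->head);
		free(p->users);
	}
	p->head = NULL;
	p->users = NULL;
}
```

A caller that wants to rewrite a checksum field would call
pkt_cow_head() first and drop the packet if it fails, which mirrors the
open-coded pattern quoted above.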


Re: [PATCH] net: sched: dev_deactivate_many(): use msleep(1) instead of yield() to wait for outstanding qdisc_run calls

2014-03-09 Thread David Miller
From: Ben Hutchings 
Date: Sun, 09 Mar 2014 19:09:20 +

> On Thu, 2014-03-06 at 16:06 -0500, David Miller wrote:
>> From: Marc Kleine-Budde 
>> Date: Wed,  5 Mar 2014 00:49:47 +0100
>> 
>> > @@ -839,7 +839,7 @@ void dev_deactivate_many(struct list_head *head)
>> >   /* Wait for outstanding qdisc_run calls. */
>> >   list_for_each_entry(dev, head, unreg_list)
>> >   while (some_qdisc_is_busy(dev))
>> > - yield();
>> > + msleep(1)
>> >  }
>> 
>> I don't understand this.
>> 
>> yield() should really _mean_ yield.
>> 
>> The intent of a yield() call, like this one here, is unambiguously
>> that the current thread cannot do anything until some other thread
>> gets onto the cpu and makes forward progress.
>>
>> Therefore it should allow lower priority threads to run, not just
>> equal or higher priority ones.
> 
> Until when?
> 
> yield() is not a sensible operation in a preemptive multitasking system,
> regardless of RT.

To me it means "I've got nothing to do if other tasks want to run right
now"  Yes, I even see it having this meaning when an RT task executes
it.

How else can you interpret the intent above?

If you change it to msleep(1), you're assigning an extra completely
arbitrary time limit to the yield.  The code doesn't want to sleep
for 1ms, that's not what it's asking for.

On the other hand, I do completely agree with other replies stating
that it would be better if we found a way for this code to wait on
something explicitly via a wait queue.

Unfortunately, that's undesirable from another perspective, in that it
would probably require making the fast paths of the qdiscs more
expensive.


[PATCHv2 2/6] HSI: Add function to register HSI clients from DT

2014-03-09 Thread Sebastian Reichel
Add new method hsi_add_clients_from_dt, which can be used
to initialize HSI clients from a device tree node.

The function currently only registers the generic hsi_char
device, which is the only one available in the Linux kernel.
Support for loading generic hsi clients will be added once
a common binding has been specified for them.

Signed-off-by: Sebastian Reichel 
---
 drivers/hsi/hsi.c   | 32 +++-
 include/linux/hsi/hsi.h |  2 ++
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/hsi/hsi.c b/drivers/hsi/hsi.c
index 749f7b5..6fde590 100644
--- a/drivers/hsi/hsi.c
+++ b/drivers/hsi/hsi.c
@@ -26,8 +26,14 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include "hsi_core.h"
 
+static struct hsi_board_info hsi_char_dev_info = {
+   .name = "hsi_char",
+};
+
 static ssize_t modalias_show(struct device *dev,
struct device_attribute *a __maybe_unused, char *buf)
 {
@@ -50,7 +56,13 @@ static int hsi_bus_uevent(struct device *dev, struct kobj_uevent_env *env)
 
 static int hsi_bus_match(struct device *dev, struct device_driver *driver)
 {
-   return strcmp(dev_name(dev), driver->name) == 0;
+   if (of_driver_match_device(dev, driver))
+   return true;
+
+   if (strcmp(dev_name(dev), driver->name) == 0)
+   return true;
+
+   return false;
 }
 
 static struct bus_type hsi_bus_type = {
@@ -101,6 +113,24 @@ static void hsi_scan_board_info(struct hsi_controller *hsi)
}
 }
 
+static void hsi_add_client_from_dt(struct hsi_port *port,
+   struct device_node *client)
+{
+   /* TODO */
+}
+
+void hsi_add_clients_from_dt(struct hsi_port *port, struct device_node *clients)
+{
+   struct device_node *child;
+
+   /* register hsi-char device */
+   hsi_new_client(port, &hsi_char_dev_info);
+
+   for_each_available_child_of_node(clients, child)
+   hsi_add_client_from_dt(port, child);
+}
+EXPORT_SYMBOL_GPL(hsi_add_clients_from_dt);
+
 static int hsi_remove_client(struct device *dev, void *data __maybe_unused)
 {
device_unregister(dev);
diff --git a/include/linux/hsi/hsi.h b/include/linux/hsi/hsi.h
index 0dca785..fb07339 100644
--- a/include/linux/hsi/hsi.h
+++ b/include/linux/hsi/hsi.h
@@ -282,6 +282,8 @@ struct hsi_controller *hsi_alloc_controller(unsigned int n_ports, gfp_t flags);
 void hsi_put_controller(struct hsi_controller *hsi);
 int hsi_register_controller(struct hsi_controller *hsi);
 void hsi_unregister_controller(struct hsi_controller *hsi);
+void hsi_add_clients_from_dt(struct hsi_port *port,
+   struct device_node *clients);
 
 static inline void hsi_controller_set_drvdata(struct hsi_controller *hsi,
void *data)
-- 
1.9.0
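
The two-step match in hsi_bus_match() above (device tree match first,
then fall back to comparing names) can be modelled in userspace as
follows. This is a sketch only: the fake_* types and functions are
invented for the illustration and merely imitate what
of_driver_match_device() and the strcmp() fallback decide.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified device: a name plus an optional compatible. */
struct fake_device {
	const char *name;
	const char *compatible;		/* NULL if not created from DT */
};

/* Hypothetical, simplified driver with an optional match table. */
struct fake_driver {
	const char *name;
	const char *const *of_match;	/* NULL-terminated list, or NULL */
};

/* Imitates the device tree match: compare compatible strings. */
static bool fake_of_match(const struct fake_device *dev,
			  const struct fake_driver *drv)
{
	const char *const *c;

	if (!dev->compatible || !drv->of_match)
		return false;
	for (c = drv->of_match; *c; c++)
		if (strcmp(*c, dev->compatible) == 0)
			return true;
	return false;
}

/* Same order as in the patch: DT match first, then the name fallback. */
static bool fake_bus_match(const struct fake_device *dev,
			   const struct fake_driver *drv)
{
	if (fake_of_match(dev, drv))
		return true;
	return strcmp(dev->name, drv->name) == 0;
}
```

Keeping the name comparison as a fallback means that clients registered
without a device tree node, such as hsi_char, still bind as before.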



[PATCHv2 1/6] Documentation: HSI: Add some general description for the HSI subsystem

2014-03-09 Thread Sebastian Reichel
Add a document that gives a rough introduction to what HSI
is and how it is handled by the Linux kernel.

Signed-off-by: Sebastian Reichel 
---
 Documentation/hsi.txt | 75 +++
 1 file changed, 75 insertions(+)
 create mode 100644 Documentation/hsi.txt

diff --git a/Documentation/hsi.txt b/Documentation/hsi.txt
new file mode 100644
index 000..6ac6cd5
--- /dev/null
+++ b/Documentation/hsi.txt
@@ -0,0 +1,75 @@
+HSI - High-speed Synchronous Serial Interface
+
+1. Introduction
+~~~~~~~~~~~~~~~
+
+High Speed Synchronous Interface (HSI) is a full-duplex, low-latency protocol
+that is optimized for die-level interconnect between an Application Processor
+and a Baseband chipset. It was specified by the MIPI alliance in 2003 and has
+been implemented by multiple vendors since then.
+
+The HSI interface supports full duplex communication over multiple channels
+(typically 8) and is capable of reaching speeds up to 200 Mbit/s.
+
+The serial protocol uses two signals, DATA and FLAG as combined data and clock
+signals and an additional READY signal for flow control. An additional WAKE
+signal can be used to wakeup the chips from standby modes. The signals are
+commonly prefixed by AC for signals going from the application die to the
+cellular die and CA for signals going the other way around.
+
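+    +------------+                                 +---------------+
+    |  Cellular  |                                 |  Application  |
+    |    Die     |                                 |      Die      |
+    |            | - - - - - - CAWAKE - - - - - - >|               |
+    |           T|------------ CADATA ------------>|R              |
+    |           X|------------ CAFLAG ------------>|X              |
+    |            |<----------- ACREADY ------------|               |
+    |            |                                 |               |
+    |            |                                 |               |
+    |            |< - - - - -  ACWAKE - - - - - - -|               |
+    |           R|<----------- ACDATA -------------|T              |
+    |           X|<----------- ACFLAG -------------|X              |
+    |            |------------ CAREADY ----------->|               |
+    |            |                                 |               |
+    |            |                                 |               |
+    +------------+                                 +---------------+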
+
+2. HSI Subsystem in Linux
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In the Linux kernel the hsi subsystem is supposed to be used for HSI devices.
+The hsi subsystem contains drivers for hsi controllers including support for
+multi-port controllers and provides a generic API for using the HSI ports.
+
+It also contains HSI client drivers, which make use of the generic API to
+implement a protocol used on the HSI interface. These client drivers can
+use an arbitrary number of channels.
+
+3. hsi-char Device
+~~~~~~~~~~~~~~~~~~
+
+Each port automatically registers a generic client driver called hsi_char,
+which provides a character device for userspace representing the HSI port.
+It can be used to communicate via HSI from userspace. Userspace may
+configure the hsi_char device using the following ioctl commands:
+
+* HSC_RESET
+ - flush the HSI port
+
+* HSC_SET_PM
+ - enable or disable the client
+
+* HSC_SEND_BREAK
+ - send break
+
+* HSC_SET_RX
+ - set RX configuration
+
+* HSC_GET_RX
+ - get RX configuration
+
+* HSC_SET_TX
+ - set TX configuration
+
+* HSC_GET_TX
+ - get TX configuration
-- 
1.9.0



[PATCHv2 3/6] HSI: method to unregister clients from an hsi port

2014-03-09 Thread Sebastian Reichel
This exports a method to unregister all clients from
an hsi port.

Signed-off-by: Sebastian Reichel 
---
 drivers/hsi/hsi.c   | 10 ++
 include/linux/hsi/hsi.h |  1 +
 2 files changed, 11 insertions(+)

diff --git a/drivers/hsi/hsi.c b/drivers/hsi/hsi.c
index 6fde590..098cc3a 100644
--- a/drivers/hsi/hsi.c
+++ b/drivers/hsi/hsi.c
@@ -160,6 +160,16 @@ static void hsi_port_release(struct device *dev)
 }
 
 /**
+ * hsi_port_unregister_clients - Unregister all clients of an HSI port
+ * @port: The HSI port whose clients are unregistered
+ */
+void hsi_port_unregister_clients(struct hsi_port *port)
+{
+   device_for_each_child(&port->device, NULL, hsi_remove_client);
+}
+EXPORT_SYMBOL_GPL(hsi_port_unregister_clients);
+
+/**
  * hsi_unregister_controller - Unregister an HSI controller
  * @hsi: The HSI controller to register
  */
diff --git a/include/linux/hsi/hsi.h b/include/linux/hsi/hsi.h
index fb07339..89cc6f2 100644
--- a/include/linux/hsi/hsi.h
+++ b/include/linux/hsi/hsi.h
@@ -284,6 +284,7 @@ int hsi_register_controller(struct hsi_controller *hsi);
 void hsi_unregister_controller(struct hsi_controller *hsi);
 void hsi_add_clients_from_dt(struct hsi_port *port,
struct device_node *clients);
+void hsi_port_unregister_clients(struct hsi_port *port);
 
 static inline void hsi_controller_set_drvdata(struct hsi_controller *hsi,
void *data)
-- 
1.9.0



[PATCHv2 4/6] HSI: hsi-char: fix driver for multiport scenarios

2014-03-09 Thread Sebastian Reichel
Fix return code check of alloc_chrdev_region, which
returns 0 on success.

Signed-off-by: Sebastian Reichel 
---
 drivers/hsi/clients/hsi_char.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hsi/clients/hsi_char.c b/drivers/hsi/clients/hsi_char.c
index e61e5f9..3073320 100644
--- a/drivers/hsi/clients/hsi_char.c
+++ b/drivers/hsi/clients/hsi_char.c
@@ -705,7 +705,7 @@ static int hsc_probe(struct device *dev)
if (!hsc_major) {
ret = alloc_chrdev_region(&hsc_dev, hsc_baseminor,
HSC_DEVS, devname);
-   if (ret > 0)
+   if (ret == 0)
hsc_major = MAJOR(hsc_dev);
} else {
hsc_dev = MKDEV(hsc_major, hsc_baseminor);
-- 
1.9.0
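
The effect of the one-character fix above can be reproduced in
userspace. The sketch below assumes only what the commit message
states, namely that the allocator returns 0 on success, plus the usual
kernel convention of a negative errno on failure. fake_alloc_region()
and the FAKE_* macros are stand-ins invented for this example; they are
not the kernel's alloc_chrdev_region(), MAJOR() or MKDEV().

```c
#include <errno.h>

/* Stand-ins for the kernel's dev_t helpers (assumed 20 minor bits). */
#define FAKE_MINORBITS		20
#define FAKE_MKDEV(ma, mi)	(((unsigned int)(ma) << FAKE_MINORBITS) | (mi))
#define FAKE_MAJOR(dev)		((unsigned int)((dev) >> FAKE_MINORBITS))

/* Returns 0 on success and a negative errno on failure. */
static int fake_alloc_region(unsigned int *dev, int simulate_failure)
{
	if (simulate_failure)
		return -EBUSY;
	*dev = FAKE_MKDEV(250, 0);	/* 250 is an arbitrary example major */
	return 0;
}

/* Fixed check: pick up the major when the call succeeded (ret == 0). */
static unsigned int major_with_fixed_check(int simulate_failure)
{
	unsigned int dev = 0, major = 0;
	int ret = fake_alloc_region(&dev, simulate_failure);

	if (ret == 0)
		major = FAKE_MAJOR(dev);
	return major;
}

/* Old check: ret > 0 never happens, so the major is never stored. */
static unsigned int major_with_old_check(int simulate_failure)
{
	unsigned int dev = 0, major = 0;
	int ret = fake_alloc_region(&dev, simulate_failure);

	if (ret > 0)
		major = FAKE_MAJOR(dev);
	return major;
}
```

With the old check the stored major stays 0 even after a successful
allocation, so a second port would try to allocate a region again
instead of reusing the major.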



[PATCHv2 6/6] Documentation: DT: omap-ssi binding documentation

2014-03-09 Thread Sebastian Reichel
Create device tree binding documentation for
OMAP Synchronous Serial Interface (SSI) device.

Signed-off-by: Sebastian Reichel 
---
 Documentation/devicetree/bindings/hsi/omap-ssi.txt | 82 ++
 1 file changed, 82 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/hsi/omap-ssi.txt

diff --git a/Documentation/devicetree/bindings/hsi/omap-ssi.txt b/Documentation/devicetree/bindings/hsi/omap-ssi.txt
new file mode 100644
index 000..9649582
--- /dev/null
+++ b/Documentation/devicetree/bindings/hsi/omap-ssi.txt
@@ -0,0 +1,82 @@
+OMAP SSI controller bindings
+
+Required properties:
+- compatible:  Should include "ti,omap3-ssi".
+- reg-names:   Contains the values "sys" and "gdd".
+- reg: Contains a register specifier for each entry in
+   reg-names.
+- interrupt-names:  Contains the value "gdd_mpu".
+- interrupts:  Contains interrupt information for each entry in
+   interrupt-names.
+- ranges:  Represents the bus address mapping between the main
+   controller node and the child nodes below.
+- clocks:  Contains clock specifiers for each entry in
+clock-names.
+- clock-names: Must include the following entries:
+  "ssi_ssr_fck": The OMAP clock of that name
+  "ssi_sst_fck": The OMAP clock of that name
+  "ssi_ick": The OMAP clock of that name
+- #address-cells:  Should be set to <1>
+- #size-cells: Should be set to <1>
+
+Each port is represented as a sub-node of the ti,omap3-ssi device.
+
+Required Port sub-node properties:
+- compatible:  Should be set to the following value
+ti,omap3-ssi-port (applicable to OMAP34xx devices)
+- reg-names:   Contains the values "rx" and "tx".
+- reg: Contains a register specifier for each entry in
+   reg-names.
+- interrupt-parent: Should be a phandle for the interrupt controller
+- interrupt-names: Contains the values "mpu_irq0" and "mpu_irq1".
+- interrupts:  Contains interrupt information for each entry in
+   interrupt-names.
+- ti,ssi-cawake-gpio:  Defines which GPIO pin is used to signify CAWAKE
+   events for the port. This is an optional board-specific
+   property. If it's missing the port will not be
+   enabled.
+
+Example for Nokia N900:
+
+ssi-controller@48058000 {
+	compatible = "ti,omap3-ssi";
+
+	/* needed until hwmod is updated to use the compatible string */
+	ti,hwmods = "ssi";
+
+	reg = <0x48058000 0x1000>,
+	      <0x48059000 0x1000>;
+	reg-names = "sys",
+		    "gdd";
+
+	interrupts = <55>;
+	interrupt-names = "gdd_mpu";
+
+	clocks = <&ssi_ssr_fck>,
+		 <&ssi_sst_fck>,
+		 <&ssi_ick>;
+	clock-names = "ssi_ssr_fck",
+		      "ssi_sst_fck",
+		      "ssi_ick";
+
+	#address-cells = <1>;
+	#size-cells = <1>;
+	ranges;
+
+	ssi-port@0 {
+		compatible = "ti,omap3-ssi-port";
+
+		reg = <0x4805a000 0x800>,
+		      <0x4805a800 0x800>;
+		reg-names = "tx",
+			    "rx";
+
+		interrupt-parent = <>;
+		interrupts = <51>,
+			     <52>;
+		interrupt-names = "mpu_irq0",
+				  "mpu_irq1";
+
+		ti,ssi-cawake-gpio = < 23 GPIO_ACTIVE_HIGH>; /* 151 */
+	};
+};
-- 
1.9.0



[PATCHv2 0/6] OMAP SSI driver

2014-03-09 Thread Sebastian Reichel
Hi,

This is the sixth round of the OMAP SSI driver patches. I think the OMAP SSI
driver is ready for mainline and should be included in 3.15. This round updates
the patchset according to the comments from Mark Rutland and Rob Herring.

Changes since PATCHv1 [0]:
 * add a general description of what HSI is (Documentation/hsi.txt)
 * remove generic HSI client binding for now. I will send a separate
   patchset to discuss the HSI client binding.
 * Replace (*struct->func)(args) by struct->func(args)
 * Replace platform_get_resource_byname by platform_get_irq_byname
 * omap-ssi: only count children compatible with "ti,omap3-ssi-port"
 * omap-ssi: only populate subdevices compatible with "ti,omap3-ssi-port"

TODO:
* Central Message Queue
  I did not yet implement a central message queue in the HSI framework.
  I will do this after Nokia N900 modem is working in the mainline kernel.
* Remove the hwmod DT hack
  This depends on some future work merging hwmod data into DT.
* Implement proper context loss detection

P.S.: It would be nice if I get some Reviewed-By/Acked-By.

[0] https://lkml.org/lkml/2014/2/23/173

-- Sebastian

Sebastian Reichel (6):
  Documentation: HSI: Add some general description for the HSI subsystem
  HSI: Add function to register HSI clients from DT
  HSI: method to unregister clients from an hsi port
  HSI: hsi-char: fix driver for multiport scenarios
  HSI: Introduce OMAP SSI driver
  Documentation: DT: omap-ssi binding documentation

 Documentation/devicetree/bindings/hsi/omap-ssi.txt |   82 ++
 Documentation/hsi.txt  |   75 ++
 drivers/hsi/Kconfig|1 +
 drivers/hsi/Makefile   |1 +
 drivers/hsi/clients/hsi_char.c |2 +-
 drivers/hsi/controllers/Kconfig|   19 +
 drivers/hsi/controllers/Makefile   |6 +
 drivers/hsi/controllers/omap_ssi.c |  621 +
 drivers/hsi/controllers/omap_ssi.h |  166 +++
 drivers/hsi/controllers/omap_ssi_port.c| 1401 
 drivers/hsi/controllers/omap_ssi_regs.h|  171 +++
 drivers/hsi/hsi.c  |   42 +-
 include/linux/hsi/hsi.h|3 +
 13 files changed, 2588 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/hsi/omap-ssi.txt
 create mode 100644 Documentation/hsi.txt
 create mode 100644 drivers/hsi/controllers/Kconfig
 create mode 100644 drivers/hsi/controllers/Makefile
 create mode 100644 drivers/hsi/controllers/omap_ssi.c
 create mode 100644 drivers/hsi/controllers/omap_ssi.h
 create mode 100644 drivers/hsi/controllers/omap_ssi_port.c
 create mode 100644 drivers/hsi/controllers/omap_ssi_regs.h

-- 
1.9.0



[PATCH net-next] hyperv: Change the receive buffer size for legacy hosts

2014-03-09 Thread Haiyang Zhang
Due to a bug in the Hyper-V host version 2008R2, we need to use a slightly
smaller receive buffer size; otherwise the buffer will not be accepted by
the legacy hosts.

Signed-off-by: Haiyang Zhang 
---
 drivers/net/hyperv/hyperv_net.h |1 +
 drivers/net/hyperv/netvsc.c |6 +-
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 39fc230..ea5f182 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -516,6 +516,7 @@ struct nvsp_message {
 #define NETVSC_MTU 65536
 
 #define NETVSC_RECEIVE_BUFFER_SIZE (1024*1024*16)  /* 16MB */
+#define NETVSC_RECEIVE_BUFFER_SIZE_LEGACY  (1024*1024*15)  /* 15MB */
 
 #define NETVSC_RECEIVE_BUFFER_ID   0xcafe
 
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 1a0280d..daddea2 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -365,6 +365,11 @@ static int netvsc_connect_vsp(struct hv_device *device)
goto cleanup;
 
/* Post the big receive buffer to NetVSP */
+   if (net_device->nvsp_version <= NVSP_PROTOCOL_VERSION_2)
+   net_device->recv_buf_size = NETVSC_RECEIVE_BUFFER_SIZE_LEGACY;
+   else
+   net_device->recv_buf_size = NETVSC_RECEIVE_BUFFER_SIZE;
+
ret = netvsc_init_recv_buf(device);
 
 cleanup:
@@ -898,7 +903,6 @@ int netvsc_device_add(struct hv_device *device, void *additional_info)
ndev = net_device->ndev;
 
/* Initialize the NetVSC channel extension */
-   net_device->recv_buf_size = NETVSC_RECEIVE_BUFFER_SIZE;
spin_lock_init(&net_device->recv_pkt_list_lock);
 
INIT_LIST_HEAD(&net_device->recv_pkt_list);
-- 
1.7.4.1
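
The selection logic added to netvsc_connect_vsp() boils down to one
comparison. A minimal standalone sketch, assuming the two buffer sizes
from the patch: the numeric NVSP version codes below are assumptions
made for this example and should be checked against hyperv_net.h, and
netvsc_recv_buf_size() is a hypothetical helper, not a function in the
driver.

```c
#include <stdint.h>

/* Buffer sizes as defined in the patch. */
#define NETVSC_RECEIVE_BUFFER_SIZE		(1024 * 1024 * 16)	/* 16MB */
#define NETVSC_RECEIVE_BUFFER_SIZE_LEGACY	(1024 * 1024 * 15)	/* 15MB */

/* Assumed protocol version codes, for illustration only. */
#define NVSP_PROTOCOL_VERSION_1	2
#define NVSP_PROTOCOL_VERSION_2	0x30002
#define NVSP_PROTOCOL_VERSION_4	0x40000

/* Hypothetical helper: receive buffer size for a negotiated version. */
static uint32_t netvsc_recv_buf_size(uint32_t nvsp_version)
{
	if (nvsp_version <= NVSP_PROTOCOL_VERSION_2)
		return NETVSC_RECEIVE_BUFFER_SIZE_LEGACY;
	return NETVSC_RECEIVE_BUFFER_SIZE;
}
```

Because the size now depends on the negotiated version, it can only be
chosen after version negotiation, which is why the assignment moves out
of netvsc_device_add().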



Re: [PATCH 2/2] mm: Changed pr_warning() to pr_warn()

2014-03-09 Thread Joe Perches
On Sun, 2014-03-09 at 17:15 +0900, Choi Gi-yong wrote:
> Signed-off-by: Choi Gi-yong 

[]

> diff --git a/mm/percpu.c b/mm/percpu.c

> @@ -812,8 +812,8 @@ fail_unlock:
>  fail_unlock_mutex:
>   mutex_unlock(&pcpu_alloc_mutex);
>   if (warn_limit) {
> - pr_warn("PERCPU: allocation failed, size=%zu align=%zu, "
> -"%s\n", size, align, err);
> + pr_warn("PERCPU: allocation failed, size=%zu align=%zu, %s\n",
> + size, align, err);

The second line should use 3 tabs for indentation.

		pr_warn("PERCPU: allocation failed, size=%zu align=%zu, %s\n",
			size, align, err);

If you want to become familiar with kernel style
and patching something, please practice on some
files in drivers/staging.
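
A minimal userspace sketch of the same layout, assuming snprintf() as a stand-in for pr_warn(). The helper name format_alloc_failure() is made up for the example. It only shows the shape being asked for: the user-visible format string stays unbroken on one line (which also keeps it greppable), and the arguments go on the continuation line.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Userspace stand-in for the kernel call, for illustration only.
 * The format string is kept on a single line; only the argument
 * list is wrapped.
 */
static int format_alloc_failure(char *buf, size_t len,
				size_t size, size_t align, const char *err)
{
	return snprintf(buf, len,
			"PERCPU: allocation failed, size=%zu align=%zu, %s\n",
			size, align, err);
}
```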



[PATCH] MN10300: Fix typo "CONFIG_GEENERIC_CLOCKEVENTS_BROADCAST"

2014-03-09 Thread Paul Bolle
Signed-off-by: Paul Bolle 
---
This typo has been in the tree since v2.6.37. Perhaps the negative test
for CONFIG_GENERIC_CLOCKEVENTS_BROADCAST can actually be dropped.
Anyhow, completely untested.

 arch/mn10300/kernel/cevt-mn10300.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mn10300/kernel/cevt-mn10300.c b/arch/mn10300/kernel/cevt-mn10300.c
index ccce35e..504ef6a 100644
--- a/arch/mn10300/kernel/cevt-mn10300.c
+++ b/arch/mn10300/kernel/cevt-mn10300.c
@@ -16,7 +16,7 @@
 #include "internal.h"
 
 #ifdef CONFIG_SMP
-#if (CONFIG_NR_CPUS > 2) && !defined(CONFIG_GEENERIC_CLOCKEVENTS_BROADCAST)
+#if (CONFIG_NR_CPUS > 2) && !defined(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST)
 #error "This doesn't scale well! Need per-core local timers."
 #endif
 #else /* CONFIG_SMP */
-- 
1.8.5.3
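
Why the typo matters can be shown with a small standalone sketch. It assumes, for the sake of the example, that the correctly spelled option is enabled; the *_GUARD_FIRES macros and the two helper functions are made up for illustration. Because the misspelled name is never defined anywhere, `!defined()` on it is always true, so the guard ignores the real configuration.

```c
/* Assume the correctly spelled option is enabled in this build. */
#define CONFIG_GENERIC_CLOCKEVENTS_BROADCAST 1

/* Misspelled name: never defined, so !defined() is always true. */
#if !defined(CONFIG_GEENERIC_CLOCKEVENTS_BROADCAST)
#define MISSPELLED_GUARD_FIRES 1
#else
#define MISSPELLED_GUARD_FIRES 0
#endif

/* Correct name: the guard now reflects the actual configuration. */
#if !defined(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST)
#define CORRECT_GUARD_FIRES 1
#else
#define CORRECT_GUARD_FIRES 0
#endif

static int misspelled_guard_fires(void)
{
	return MISSPELLED_GUARD_FIRES;
}

static int correct_guard_fires(void)
{
	return CORRECT_GUARD_FIRES;
}
```

With the typo in place, the #error in cevt-mn10300.c would depend only on CONFIG_NR_CPUS > 2.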



Re: [PATCH v7 net-next 1/3] filter: add Extended BPF interpreter and converter

2014-03-09 Thread Daniel Borkmann

On 03/09/2014 06:08 PM, Alexei Starovoitov wrote:

On Sun, Mar 9, 2014 at 5:29 AM, Daniel Borkmann  wrote:

On 03/09/2014 12:15 AM, Alexei Starovoitov wrote:


Extended BPF extends old BPF in the following ways:
- from 2 to 10 registers
  Original BPF has two registers (A and X) and hidden frame pointer.
  Extended BPF has ten registers and read-only frame pointer.
- from 32-bit registers to 64-bit registers
  semantics of old 32-bit ALU operations are preserved via 32-bit
  subregisters
- if (cond) jump_true; else jump_false;
  old BPF insns are replaced with:
  if (cond) jump_true; /* else fallthrough */
- adds signed > and >= insns
- 16 4-byte stack slots for register spill-fill replaced with
  up to 512 bytes of multi-use stack space
- introduces bpf_call insn and register passing convention for zero
  overhead calls from/to other kernel functions (not part of this patch)
- adds arithmetic right shift insn
- adds swab32/swab64 insns
- adds atomic_add insn
- old tax/txa insns are replaced with 'mov dst,src' insn
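
The "32-bit subregisters" item can be illustrated with a short sketch. It assumes that a 32-bit ALU operation works on the low halves of the 64-bit registers, wraps at 2^32 and zero-extends the result; the function names alu64_add() and alu32_add() are made up for the example and are not taken from the patch.

```c
#include <stdint.h>

/* 64-bit ALU add: full-width arithmetic. */
static uint64_t alu64_add(uint64_t dst, uint64_t src)
{
	return dst + src;
}

/*
 * 32-bit ALU add on 64-bit registers: use only the low 32 bits of
 * each operand, wrap at 2^32, and zero-extend the result. This is
 * what keeps old 32-bit BPF arithmetic behaving as before.
 */
static uint64_t alu32_add(uint64_t dst, uint64_t src)
{
	return (uint32_t)((uint32_t)dst + (uint32_t)src);
}
```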

Extended BPF is designed to be JITed with one to one mapping, which
allows GCC/LLVM backends to generate optimized BPF code that performs
almost as fast as natively compiled code.

sk_convert_filter() remaps old style insns into extended:
'sock_filter' instructions are remapped on the fly to
'sock_filter_ext' extended instructions when
sysctl net.core.bpf_ext_enable=1

Old filter comes through sk_attach_filter() or
sk_unattached_filter_create()
    if (bpf_ext_enable) {
        convert to new
        sk_chk_filter() - check old bpf
        use sk_run_filter_ext() - new interpreter
    } else {
        sk_chk_filter() - check old bpf
        if (bpf_jit_enable)
            use old jit
        else
            use sk_run_filter() - old interpreter
    }
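
The decision tree above, reduced to a sketch. The enum and select_engine() are hypothetical names for illustration; only the two sysctl-style flags and the precedence between them are taken from the description.

```c
#include <stdbool.h>

enum bpf_engine {
	ENGINE_OLD_INTERPRETER,	/* sk_run_filter() */
	ENGINE_OLD_JIT,		/* existing per-arch JIT */
	ENGINE_EXT_INTERPRETER,	/* sk_run_filter_ext() */
};

/*
 * bpf_ext_enable takes precedence: the old filter is converted and
 * run by the new interpreter. bpf_jit_enable is only consulted when
 * the extended path is off.
 */
static enum bpf_engine select_engine(bool bpf_ext_enable, bool bpf_jit_enable)
{
	if (bpf_ext_enable)
		return ENGINE_EXT_INTERPRETER;
	return bpf_jit_enable ? ENGINE_OLD_JIT : ENGINE_OLD_INTERPRETER;
}
```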

sk_run_filter_ext() interpreter is noticeably faster
than sk_run_filter() for two reasons:

1. fall-through jumps
   Old BPF jump instructions are forced to go either 'true' or 'false'
   branch which causes branch-miss penalty.
   Extended BPF jump instructions have one branch and fall-through,
   which fit CPU branch predictor logic better.
   'perf stat' shows drastic difference for branch-misses.
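
A sketch of the difference between the two jump encodings. The structs are simplified stand-ins made up for the example, not the real instruction layouts; offsets are taken to be relative to the next instruction in both cases.

```c
#include <stdint.h>

/* Classic-BPF-style conditional jump: both outcomes carry an offset. */
struct old_jmp {
	uint8_t jt;	/* offset if the condition is true */
	uint8_t jf;	/* offset if the condition is false */
};

/* Extended-BPF-style conditional jump: one offset, else fall through. */
struct ext_jmp {
	int16_t off;
};

/* Next instruction index after a classic jump at index pc. */
static int old_next_pc(int pc, struct old_jmp j, int cond)
{
	return pc + 1 + (cond ? j.jt : j.jf);
}

/* Next instruction index after an extended jump at index pc. */
static int ext_next_pc(int pc, struct ext_jmp j, int cond)
{
	return cond ? pc + 1 + j.off : pc + 1;
}
```

In the second form the not-taken case is simply the next instruction, which is the pattern branch predictors handle best.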

2. jump-threaded implementation of interpreter vs switch statement
   Instead of single tablejump at the top of 'switch' statement, GCC will
   generate multiple tablejump instructions, which helps CPU branch
   predictor.
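
A toy example of the jump-threaded style, using the GNU C "labels as values" extension (supported by GCC and Clang, not ISO C). The opcodes, struct toy_insn and toy_run() are invented for illustration and have nothing to do with the real instruction set. The sketch assumes every opcode in the program is valid and that the program ends with OP_RET.

```c
#include <stdint.h>

enum { OP_ADD, OP_SUB, OP_RET };	/* toy opcodes, not BPF's */

struct toy_insn {
	uint8_t op;
	int32_t imm;
};

/*
 * Each handler ends with its own indirect jump to the next handler,
 * instead of looping back to a single shared 'switch'. That gives the
 * CPU one indirect branch per handler to predict.
 */
static int64_t toy_run(const struct toy_insn *insn)
{
	static const void *const jumptable[] = {
		[OP_ADD] = &&do_add,
		[OP_SUB] = &&do_sub,
		[OP_RET] = &&do_ret,
	};
	int64_t acc = 0;

#define DISPATCH() goto *jumptable[insn->op]

	DISPATCH();
do_add:
	acc += insn->imm;
	insn++;
	DISPATCH();
do_sub:
	acc -= insn->imm;
	insn++;
	DISPATCH();
do_ret:
	return acc;
#undef DISPATCH
}
```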

Performance of two BPF filters generated by libpcap was measured
on x86_64, i386 and arm32.

fprog #1 is taken from Documentation/networking/filter.txt:
tcpdump -i eth0 port 22 -dd

fprog #2 is taken from 'man tcpdump':
tcpdump -i eth0 'tcp port 22 and (((ip[2:2] - ((ip[0]&0xf)<<2)) -
 ((tcp[12]&0xf0)>>2)) != 0)' -dd

Other libpcap programs have similar performance differences.

Raw performance data from BPF micro-benchmark:
SK_RUN_FILTER on same SKB (cache-hit) or 10k SKBs (cache-miss)
time in nsec per call, smaller is better
--x86_64--
             fprog #1   fprog #1    fprog #2   fprog #2
             cache-hit  cache-miss  cache-hit  cache-miss
old BPF          90        101         192        202
ext BPF          31         71          47         97
old BPF jit      12         34          17         44
ext BPF jit  TBD

--i386--
             fprog #1   fprog #1    fprog #2   fprog #2
             cache-hit  cache-miss  cache-hit  cache-miss
old BPF         107        136         227        252
ext BPF          40        119          69        172

--arm32--
             fprog #1   fprog #1    fprog #2   fprog #2
             cache-hit  cache-miss  cache-hit  cache-miss
old BPF         202        300         475        540
ext BPF         180        270         330        470
old BPF jit      26        182          37        202
new BPF jit  TBD
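
To put the numbers in relative terms, the fprog #1 cache-hit figures work out to roughly 90/31 = 2.9x on x86_64, 107/40 = 2.7x on i386 and 202/180 = 1.1x on arm32 for the new interpreter over the old one. The helper below is just that arithmetic; speedup() is a made-up name.

```c
/*
 * Ratio of old to new time per call. Values above 1.0 mean the new
 * path is faster. new_nsec must be non-zero.
 */
static double speedup(unsigned int old_nsec, unsigned int new_nsec)
{
	return (double)old_nsec / (double)new_nsec;
}
```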

Tested with trinity BPF fuzzer

Future work:

0. add bpf/ebpf testsuite to tools/testing/selftests/net/bpf

1. add extended BPF JIT for x86_64

2. add inband old/new demux and extended BPF verifier, so that new programs
   can be loaded through old sk_attach_filter() and
   sk_unattached_filter_create() interfaces

3. tracing filters systemtap-like with extended BPF

4. OVS with extended BPF

5. nftables with extended BPF

Signed-off-by: Alexei Starovoitov 
Acked-by: Hagen Paul Pfeifer 
Reviewed-by: Daniel Borkmann 



One more question or possible issue that came to mind: when someone
attaches a socket filter from user space and bpf_ext_enable=1, the old
filter is transparently converted to the new representation. If user
space then (e.g. through checkpoint restore) issues a sk_get_filter(),
we call sk_decode_filter() on sk->sk_filter and try to decode what we
stored in insns_ext[] on the assumption that it is still the old code.
Would that actually crash (or leak memory, or just return garbage), as
we access the decodes[] array with filt->code? Would be great if you
could double-check.
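
The hazard being asked about is a table lookup indexed by a value from a different instruction encoding. A toy illustration follows; it is not the kernel's sk_decode_filter(), the table contents and names (toy_decodes, toy_decode) are made up, and the bounds check is just one defensive option rather than anything proposed in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy decode table: internal code -> user-visible opcode (made up). */
static const uint16_t toy_decodes[] = { 0x00, 0x06, 0x20, 0x28 };

#define TOY_NDECODES (sizeof(toy_decodes) / sizeof(toy_decodes[0]))

/*
 * Returns 0 and stores the decoded opcode in *out, or -1 if code does
 * not index into the table. Without the check, a code taken from
 * another encoding would read past the end of the array.
 */
static int toy_decode(uint16_t code, uint16_t *out)
{
	if (code >= TOY_NDECODES)
		return -1;
	*out = toy_decodes[code];
	return 0;
}
```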


ohh. yes. missed that.
when bpf_ext_enable=1 I think it's cleaner to return ebpf filter.
This way the user space can see how old bpf filter was 
