from:"Robert Foss"

Re: [PATCH v3 5/5] net: asix: autoneg will set WRITE_MEDIUM reg

2016-09-06 Thread Robert Foss




On 2016-09-06 12:41 PM, Grant Grundler wrote:

On Thu, Sep 1, 2016 at 10:02 AM, Eric Dumazet <eric.duma...@gmail.com> wrote:

On Thu, 2016-09-01 at 12:47 -0400, Robert Foss wrote:


I'm not quite sure how the first From line was added, it
should not have been.
Grant Grundler is most definitely the author.

Would you like me to resubmit in v++ and make sure that it has been
corrected?


This is too late, patches are already merged in David Miller net-next
tree.

These kinds of errors can not be fixed, we have to be very careful at
submission/review time.

I guess Grant does not care, but some contributors, especially new ones
would like to get proper attribution.


I do not mind. Robert will get email about bugs instead of me. :D


Thanks Grant, sorry about the mixup!


Rob.

[PATCH v5 3/3] Documentation/filesystems: Added /proc/PID/totmaps documentation

2016-09-05 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

Added documentation covering /proc/PID/totmaps.

Signed-off-by: Robert Foss <robert.f...@collabora.com>
---
 Documentation/filesystems/proc.txt | 21 +
 1 file changed, 21 insertions(+)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index fcc1ac0..49a8483 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -11,6 +11,7 @@ Version 1.3  
Kernel version 2.2.12
  Kernel version 2.4.0-test11-pre4
 --
 fixes/update part 1.1  Stefani Seibold <stef...@seibold.net>   June 9 2009
+add totmaps    Robert Foss <robert.f...@collabora.com>  August 12 2016
 
 Table of Contents
 -
@@ -147,6 +148,8 @@ Table 1-1: Process specific entries in /proc
  stack Report full stack trace, enable via CONFIG_STACKTRACE
  smaps an extension based on maps, showing the memory consumption of
each mapping and flags associated with it
+ totmaps an extension based on maps, showing the total memory
+consumption of all mappings
  numa_maps an extension based on maps, showing the memory locality and
binding policy as well as mem usage (in pages) of each mapping.
 ..
@@ -515,6 +518,24 @@ be vanished or the reverse -- new added.
 This file is only present if the CONFIG_MMU kernel configuration option is
 enabled.
 
+The /proc/PID/totmaps is an extension based on maps, showing the memory
+consumption totals for all of the process's mappings. It lists the sums of the
+same statistics as /proc/PID/smaps.
+
+The process' mappings will be summarized as a series of lines like the
+following:
+
+Rss:                4256 kB
+Pss:                1170 kB
+Shared_Clean:       2720 kB
+Shared_Dirty:       1136 kB
+Private_Clean:         0 kB
+Private_Dirty:       400 kB
+Referenced:         4256 kB
+Anonymous:          1536 kB
+AnonHugePages:         0 kB
+Swap:                  0 kB
+
 The /proc/PID/clear_refs is used to reset the PG_Referenced and ACCESSED/YOUNG
 bits on both physical and virtual pages associated with a process, and the
 soft-dirty bit on pte (see Documentation/vm/soft-dirty.txt for details).
-- 
2.7.4
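
The sample output in the documentation patch above is a list of "Key: value kB" lines. As a rough userspace sketch of how a consumer might read one such line (parse_kb_line is a hypothetical helper name and not part of the patch or of any kernel API; the "kB" unit check is an assumption based only on the sample shown):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one "Key:   <number> kB" line in the style of the sample output.
 * Returns 1 on success and fills key (NUL-terminated) and *kb, 0 otherwise.
 * Hypothetical helper for illustration only.
 */
static int parse_kb_line(const char *line, char *key, size_t keysz,
                         unsigned long *kb)
{
    const char *colon;
    size_t len;
    char unit[3];

    assert(line && key && kb && keysz > 0);

    colon = strchr(line, ':');
    if (!colon)
        return 0;

    len = (size_t)(colon - line);
    if (len == 0 || len >= keysz)
        return 0;

    /* Expect "<digits> kB" after the colon; %lu skips leading blanks. */
    if (sscanf(colon + 1, "%lu %2s", kb, unit) != 2 ||
        strcmp(unit, "kB") != 0)
        return 0;

    memcpy(key, line, len);
    key[len] = '\0';
    return 1;
}
```

A monitoring tool would call this once per line read from the file and pick out the keys it cares about, for example Pss.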

[PATCH v5 2/3] Documentation/filesystems: Fixed typo

2016-09-05 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

Fixed a -> an typo.

Signed-off-by: Robert Foss <robert.f...@collabora.com>
---
 Documentation/filesystems/proc.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 68080ad..fcc1ac0 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -145,7 +145,7 @@ Table 1-1: Process specific entries in /proc
symbol the task is blocked in - or "0" if not blocked.
  pagemap   Page table
  stack Report full stack trace, enable via CONFIG_STACKTRACE
- smaps a extension based on maps, showing the memory consumption of
+ smaps an extension based on maps, showing the memory consumption of
each mapping and flags associated with it
  numa_maps an extension based on maps, showing the memory locality and
binding policy as well as mem usage (in pages) of each mapping.
-- 
2.7.4

[PATCH v5 1/3] mm, proc: Implement /proc/<pid>/totmaps

2016-09-05 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

This is based on earlier work by Thiago Goncales. It implements a new
per process proc file which summarizes the contents of the smaps file
but doesn't display any addresses.  It gives more detailed information
than statm like the PSS (proportional set size).  It differs from the
original implementation in that it doesn't use the full blown set of
seq operations, uses a different termination condition, and doesn't
display "Locked" as that was broken in the original implementation.

This new proc file provides information faster than parsing the potentially
huge smaps file.

Tested-by: Robert Foss <robert.f...@collabora.com>
Signed-off-by: Robert Foss <robert.f...@collabora.com>

Signed-off-by: Sonny Rao <sonny...@chromium.org>
---
 fs/proc/base.c |   1 +
 fs/proc/internal.h |   2 +
 fs/proc/task_mmu.c | 148 +
 3 files changed, 151 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index ac0df4d..dc7e81b7 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2854,6 +2854,7 @@ static const struct pid_entry tgid_base_stuff[] = {
REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
REG("smaps",  S_IRUGO, proc_pid_smaps_operations),
REG("pagemap",S_IRUSR, proc_pagemap_operations),
+   REG("totmaps",S_IRUGO, proc_totmaps_operations),
 #endif
 #ifdef CONFIG_SECURITY
DIR("attr",   S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 7931c55..3bdafe8 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -298,6 +298,8 @@ extern const struct file_operations proc_pid_smaps_operations;
 extern const struct file_operations proc_tid_smaps_operations;
 extern const struct file_operations proc_clear_refs_operations;
 extern const struct file_operations proc_pagemap_operations;
+extern const struct file_operations proc_totmaps_operations;
+
 
 extern unsigned long task_vsize(struct mm_struct *);
 extern unsigned long task_statm(struct mm_struct *,
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 187d84e..f0f4fee 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -810,6 +810,75 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
return 0;
 }
 
+static void add_smaps_sum(struct mem_size_stats *mss,
+   struct mem_size_stats *mss_sum)
+{
+   mss_sum->resident += mss->resident;
+   mss_sum->pss += mss->pss;
+   mss_sum->shared_clean += mss->shared_clean;
+   mss_sum->shared_dirty += mss->shared_dirty;
+   mss_sum->private_clean += mss->private_clean;
+   mss_sum->private_dirty += mss->private_dirty;
+   mss_sum->referenced += mss->referenced;
+   mss_sum->anonymous += mss->anonymous;
+   mss_sum->anonymous_thp += mss->anonymous_thp;
+   mss_sum->swap += mss->swap;
+}
+
+static int totmaps_proc_show(struct seq_file *m, void *data)
+{
+   struct proc_maps_private *priv = m->private;
+   struct mm_struct *mm = priv->mm;
+   struct vm_area_struct *vma;
+   struct mem_size_stats mss_sum;
+
+   memset(&mss_sum, 0, sizeof(mss_sum));
+   down_read(&mm->mmap_sem);
+   hold_task_mempolicy(priv);
+
+   for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
+      struct mem_size_stats mss;
+      struct mm_walk smaps_walk = {
+         .pmd_entry = smaps_pte_range,
+         .mm = vma->vm_mm,
+         .private = &mss,
+      };
+
+      if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
+         memset(&mss, 0, sizeof(mss));
+         walk_page_vma(vma, &smaps_walk);
+         add_smaps_sum(&mss, &mss_sum);
+      }
+   }
+
+   release_task_mempolicy(priv);
+   up_read(&mm->mmap_sem);
+
+   seq_printf(m,
+  "Rss:            %8lu kB\n"
+  "Pss:            %8lu kB\n"
+  "Shared_Clean:   %8lu kB\n"
+  "Shared_Dirty:   %8lu kB\n"
+  "Private_Clean:  %8lu kB\n"
+  "Private_Dirty:  %8lu kB\n"
+  "Referenced:     %8lu kB\n"
+  "Anonymous:      %8lu kB\n"
+  "AnonHugePages:  %8lu kB\n"
+  "Swap:           %8lu kB\n",
+  mss_sum.resident >> 10,
+  (unsigned long)(mss_sum.pss >> (10 + PSS_SHIFT)),
+  mss_sum.shared_clean  >> 10,
+  mss_sum.shared_dirty  >> 10,
+  mss_sum.private_clean >> 10,
+  mss_su
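
The patch prints the PSS total as `mss_sum.pss >> (10 + PSS_SHIFT)`, while the other counters are only shifted by 10. A small userspace model may help explain why: it assumes, as I understand the smaps accounting to work, that PSS is kept as a fixed-point byte count scaled by 2^PSS_SHIFT so that dividing a page among several mappers does not lose the fractional part on every page. The PSS_SHIFT value of 12 and the names pss_model, account_page and pss_kb are assumptions for illustration, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the kernel's definition at the time; illustrative only. */
#define PSS_SHIFT 12

/* Hypothetical stand-in for the pss member of struct mem_size_stats. */
struct pss_model {
    uint64_t pss;   /* bytes, scaled by 2^PSS_SHIFT */
};

/* Account `size` bytes that are mapped by `mapcount` processes. */
static void account_page(struct pss_model *m, uint64_t size,
                         unsigned int mapcount)
{
    assert(m && mapcount > 0);
    /* Scale first, then divide, so the remainder survives in the low bits. */
    m->pss += (size << PSS_SHIFT) / mapcount;
}

/* Convert the scaled byte total to kB: drop the scale, then divide by 1024. */
static unsigned long pss_kb(const struct pss_model *m)
{
    return (unsigned long)(m->pss >> (10 + PSS_SHIFT));
}
```

With one private 4096-byte page and two pages each shared by two processes, this model reports 4 + 2 + 2 = 8 kB.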

[PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps

2016-09-05 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

This series provides the /proc/PID/totmaps feature, which
summarizes the information provided by /proc/PID/smaps for
improved performance and usability reasons.

A use case is to speed up monitoring of memory consumption in
environments where RSS isn't precise.

For example Chrome tends to have many processes which have hundreds of VMAs
with a substantial amount of shared memory, and the error of using
RSS rather than PSS tends to be very large when looking at overall
memory consumption.  PSS isn't kept as a single number that's exported
like RSS, so to calculate PSS means having to parse a very large smaps
file.

This process is slow and has to be repeated for many processes, and we
found that just the act of doing the parsing was taking up a
significant amount of CPU time, so this patch is an attempt to make
that process cheaper.

/proc/PID/totmaps provides roughly a 2x speedup compared to parsing
/proc/PID/smaps with awk.

$ ps aux | grep firefox
robertfoss   5025 24.3 13.7 3562820 2219616 ? Rl   Aug15 277:44 
/usr/lib/firefox/firefox https://allg.one/xpb
$ awk '/^[0-9a-f]/{print}' /proc/5025/smaps | wc -l
1503
$ /usr/bin/time -v -p zsh -c "(repeat 25 {cat /proc/5025/totmaps})"
[...]
Command being timed: "zsh -c (repeat 25 {cat /proc/5025/totmaps})"
User time (seconds): 0.00
System time (seconds): 0.40
Percent of CPU this job got: 90%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.45

$ /usr/bin/time -v -p zsh -c "repeat 25 { awk '/^Rss/{rss+=\$2} 
/^Pss/{pss+=\$2} END {printf \"rss:%d pss:%d\n\", rss, pss}\' /proc/5025/smaps 
}"
[...]
Command being timed: "zsh -c repeat 25 { awk '/^Rss/{rss+=$2} 
/^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}\' /proc/5025/smaps }"
User time (seconds): 0.37
System time (seconds): 0.45
Percent of CPU this job got: 92%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.89


Changes since v1:
- Removed IS_ERR check from get_task_mm() function
- Changed comment format
- Moved proc_totmaps_operations declaration inside internal.h
- Switched to using do_maps_open() in totmaps_open() function,
  which provides privilege checking
- Error handling reworked for totmaps_open() function
- Switched to stack allocated struct mem_size_stats mss_sum in
  totmaps_proc_show() function
- Removed get_task_mm() in totmaps_proc_show() since priv->mm
  already is available
- Added support to proc_map_release() for priv==NULL, to allow
  function to be used for all failure cases
- Added proc_totmaps_op and helper functions for it
- Added documentation in separate patch
- Removed totmaps_release() since it was just a wrapper for proc_map_release()

Changes since v2:
- Removed struct mem_size_stats *mss from struct proc_maps_private
- Removed priv->task assignment in totmaps_open() call
- Moved some assignments in totmaps_open() around to increase code
  clarity
- Moved some function calls to unlock data structures before printing

Changes since v3:
- Fixed typo in totmaps documentation
- Fixed issue where proc_map_release wasn't called on error
- Fixed put_task_struct not being called during .release()

Changes since v4:
- Prevent access to invalid processes

Robert Foss (3):
  mm, proc: Implement /proc/<pid>/totmaps
  Documentation/filesystems: Fixed typo
  Documentation/filesystems: Added /proc/PID/totmaps documentation

 Documentation/filesystems/proc.txt |  23 +-
 fs/proc/base.c |   1 +
 fs/proc/internal.h |   2 +
 fs/proc/task_mmu.c | 148 +
 4 files changed, 173 insertions(+), 1 deletion(-)

-- 
2.7.4
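
For comparison with the awk one-liner timed in the cover letter above, the same per-field summation over smaps-style text can be sketched in C. This is only an illustration of the userspace work that totmaps is meant to avoid; sum_smaps_field is a hypothetical helper, and unlike the awk pattern it requires the colon directly after the key, so "Pss" does not match "SwapPss".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sum every "<key>: <n> kB" line in smaps-style text held in memory.
 * Hypothetical helper for illustration only.
 */
static unsigned long sum_smaps_field(const char *text, const char *key)
{
    size_t klen;
    unsigned long total = 0;
    const char *line = text;

    assert(text && key);
    klen = strlen(key);

    while (*line) {
        const char *eol = strchr(line, '\n');

        /* Match only at line start, with ':' right after the key. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':')
            total += strtoul(line + klen + 1, NULL, 10);

        if (!eol)
            break;
        line = eol + 1;
    }
    return total;
}
```

A real tool would still have to read the whole smaps file into the buffer first, which is the cost the series is trying to remove.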

Re: [PATCH v4 1/3] mm, proc: Implement /proc/<pid>/totmaps

2016-09-01 Thread Robert Foss




On 2016-08-31 01:04 PM, Mateusz Guzik wrote:

On Wed, Aug 31, 2016 at 12:36:26PM -0400, Robert Foss wrote:

On 2016-08-31 05:45 AM, Jacek Anaszewski wrote:

+static void *m_totmaps_start(struct seq_file *p, loff_t *pos)
+{
+return NULL + (*pos == 0);
+}
+
+static void *m_totmaps_next(struct seq_file *p, void *v, loff_t *pos)
+{
+++*pos;
+return NULL;
+}
+


When reading totmaps of kernel processes the following NULL pointer
dereference occurs:

Unable to handle kernel NULL pointer dereference at virtual address
0044
[] (down_read) from [] (totmaps_proc_show+0x2c/0x1e8)
[] (totmaps_proc_show) from [] (seq_read+0x1c8/0x4b8)
[] (seq_read) from [] (__vfs_read+0x2c/0x110)
[] (__vfs_read) from [] (vfs_read+0x8c/0x110)
[] (vfs_read) from [] (SyS_read+0x40/0x8c)
[] (SyS_read) from [] (ret_fast_syscall+0x0/0x3c)

It seems that some protection is needed for such processes, so that
totmaps would return empty string then, like in case of smaps.



Thanks for the testing Jacek!

I had a look around the corresponding smaps code, but I'm not seeing any
checks, do you know where that check actually is made?



See m_start in fs/proc/task_mmu.c. It not only checks for non-null mm,
but also tries to bump ->mm_users and only then proceeds to walk the mm.


So a m_totmaps_start that looks something like the below would be
enough? And if so, would mm->mm_users need to be decremented inside of
m_totmaps_start?

static void *m_totmaps_start(struct seq_file *p, loff_t *pos)
{
	struct proc_maps_private *priv = p->private;
	struct mm_struct *mm;

	mm = priv->mm;
	if (!mm || !atomic_inc_not_zero(&mm->mm_users))
		return NULL;

	return NULL + (*pos == 0);
}
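
The question above turns on what "increment unless zero" means for a reference count. The following is a userspace model using C11 atomics, not kernel code; ref_get_not_zero and ref_put are hypothetical names standing in for the roles that atomic_inc_not_zero() and a matching put would play. My reading, which the thread itself does not confirm, is that a successful grab has to be balanced by a later put once the walk is finished, rather than being undone inside the start callback.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the object is still live (count != 0). */
static bool ref_get_not_zero(atomic_int *refs)
{
    int old;

    assert(refs);
    old = atomic_load(refs);
    while (old != 0) {
        /* On failure `old` is refreshed with the current value; retry. */
        if (atomic_compare_exchange_weak(refs, &old, old + 1))
            return true;
    }
    return false;   /* count already hit zero: do not resurrect it */
}

/* Drop a reference; returns true when this was the last one. */
static bool ref_put(atomic_int *refs)
{
    assert(refs);
    return atomic_fetch_sub(refs, 1) == 1;
}
```

In this model a caller that gets `true` back owns one reference and must eventually call ref_put(); a caller that gets `false` must not touch the object at all.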

Re: [PATCH v3 5/5] net: asix: autoneg will set WRITE_MEDIUM reg

2016-09-01 Thread Robert Foss

On 2016-09-01 12:43 PM, Eric Dumazet wrote:

On Mon, 2016-08-29 at 09:32 -0400, robert.f...@collabora.com wrote:

From: Robert Foss <robert.f...@collabora.com>

From: Grant Grundler <grund...@chromium.org>

The miii_nway_restart() causes a PHY link change activity and
ax88772_link_reset will be called. link_reset will set
AX_CMD_WRITE_MEDIUM_MODE register correctly.

The asix_write_medium_mode in reset() fills in a default value to the register
which may be different from the negotiation result. So do this first.

Ignore the ret value since it's ignored in XXX_link_reset() functions.

Signed-off-by: Grant Grundler <grund...@google.com>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---

This is _really_ confusing Robert.

Why having two 'From:' clauses ?

Who wrote the patch in the first place ? You or Grant ?

I'm not quite sure how the first From line was added, it
should not have been.
Grant Grundler is most definitely the author.

Would you like me to resubmit in v++ and make sure that it has been 
corrected?

End result is :

commit 535baf8588d04b177cb33700f81499f2b5203c2d
Author: Robert Foss <robert.f...@collabora.com>
Date:   Mon Aug 29 09:32:19 2016 -0400

net: asix: autoneg will set WRITE_MEDIUM reg

From: Grant Grundler <grund...@chromium.org>

The miii_nway_restart() causes a PHY link change activity and
ax88772_link_reset will be called. link_reset will set
AX_CMD_WRITE_MEDIUM_MODE register correctly.

The asix_write_medium_mode in reset() fills in a default value to the 
register
which may be different from the negotiation result. So do this first.

Ignore the ret value since it's ignored in XXX_link_reset() functions.

Signed-off-by: Grant Grundler <grund...@google.com>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
Signed-off-by: David S. Miller <da...@davemloft.net>

I guess Grant wrote the patch, but attribution is wrong.

Re: [PATCH v3 0/5] net/usb: asix driver improvements

2016-08-31 Thread Robert Foss


Additional testing has been done on the hardware that is available to me.
I'm not seeing any dmesg warnings/errors that are new to this series:

AX88772A
- Pass network_EthernetStressPlug
- Pass phy up/down + iperf3 UDP stress
- Pass network_EthernetStressPlug + iperf3 UDP stress

AX88772B
- Pass network_EthernetStressPlug
- Pass phy up/down + iperf3 UDP stress
- Pass network_EthernetStressPlug + iperf3 UDP stress

Cisco AX88772 A(?)
- Pass network_EthernetStressPlug
- Pass phy up/down + iperf3 UDP stress
- Pass network_EthernetStressPlug + iperf3 UDP stress

Cisco AX88772 B(?)
- Pass network_EthernetStressPlug
- Pass phy up/down + iperf3 UDP stress
- Pass network_EthernetStressPlug + iperf3 UDP stress

AX88178
- Pass network_EthernetStressPlug
- Pass phy up/down + iperf3 UDP stress
- Pass network_EthernetStressPlug + iperf3 UDP stress

AX88179
- Pass network_EthernetStressPlug
- Pass phy up/down + iperf3 UDP stress?
-- [ 8794.555902] ax88179_178a 1-1.3:1.0 eth1: kevent 4 may have been
   dropped
- Pass network_EthernetStressPlug + iperf3 UDP stress


I also saw some sporadic header checksum errors.
But those too are seen on upstream/master.

asix_rx_fixup() Bad Header Length 0x98a993c8, offset 4


Test details:
network_EthernetStressPlug:
http://memcpy.io/ethernet-device-stress-testing.html

phy up/down:
for i in {1..10}
do
    sudo ifdown eth1
    if sudo ifup eth1; then
        echo "Command success"
    else
        echo "Command failed"
    fi
done

iperf3 UDP:
sudo iperf3 -c 192.168.0.28 -u  -b 100M -t 0





On 2016-08-29 09:32 AM, robert.f...@collabora.com wrote:

From: Robert Foss <robert.f...@collabora.com>

This is a resubmission of v3, since the previous submission
was not sent to the netdev mailing list.

This series improves power management of the asix driver.

 - Suspend/resume support is improved to save needed registers.
 - Device disconnection is improved.
 - Fixes AX88772x resume failures
 - Implements IEEE 802.3 spec section "22.2.4.1.1 Reset" correctly
 - Fixes AX_CMD_WRITE_MEDIUM_MODE being set incorrectly

Changes since v1:
- Added proper metadata tags to series.
- Added two more patches to series.

Changes since v2:
- Added cover letter
- Tested patches on AX88772A/AX88772B/AX88178/AX88179 hardware

Allan Chou (1):
  net: asix: Fix AX88772x resume failures

Freddy Xin (1):
  net: asix: Add in_pm parameter

Grant Grundler (2):
  net: asix: see 802.3 spec for phy reset
  net: asix: autoneg will set WRITE_MEDIUM reg

Vincent Palatin (1):
  net: asix: Avoid looping when the device is disconnected

 drivers/net/usb/asix.h |  40 ++-
 drivers/net/usb/asix_common.c  | 212 
 drivers/net/usb/asix_devices.c | 450 +++---
 drivers/net/usb/ax88172a.c |  29 +-
 4 files changed, 575 insertions(+), 156 deletions(-)

Re: [PATCH v4 1/3] mm, proc: Implement /proc/<pid>/totmaps

2016-08-31 Thread Robert Foss




On 2016-08-31 05:45 AM, Jacek Anaszewski wrote:

Hi Robert,

On 08/17/2016 12:33 AM, robert.f...@collabora.com wrote:

From: Robert Foss <robert.f...@collabora.com>

This is based on earlier work by Thiago Goncales. It implements a new
per process proc file which summarizes the contents of the smaps file
but doesn't display any addresses.  It gives more detailed information
than statm like the PSS (proportional set size).  It differs from the
original implementation in that it doesn't use the full blown set of
seq operations, uses a different termination condition, and doesn't
display "Locked" as that was broken in the original implementation.

This new proc file provides information faster than parsing the
potentially huge smaps file.

Tested-by: Robert Foss <robert.f...@collabora.com>
Signed-off-by: Robert Foss <robert.f...@collabora.com>

Signed-off-by: Sonny Rao <sonny...@chromium.org>
---
 fs/proc/base.c |   1 +
 fs/proc/internal.h |   2 +
 fs/proc/task_mmu.c | 141 +
 3 files changed, 144 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index a11eb71..de3acdf 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2855,6 +2855,7 @@ static const struct pid_entry tgid_base_stuff[] = {
 	REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
 	REG("smaps",      S_IRUGO, proc_pid_smaps_operations),
 	REG("pagemap",    S_IRUSR, proc_pagemap_operations),
+	REG("totmaps",    S_IRUGO, proc_totmaps_operations),
 #endif
 #ifdef CONFIG_SECURITY
 	DIR("attr",       S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index aa27810..99f97d7 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -297,6 +297,8 @@ extern const struct file_operations proc_pid_smaps_operations;
 extern const struct file_operations proc_tid_smaps_operations;
 extern const struct file_operations proc_clear_refs_operations;
 extern const struct file_operations proc_pagemap_operations;
+extern const struct file_operations proc_totmaps_operations;
+
 
 extern unsigned long task_vsize(struct mm_struct *);
 extern unsigned long task_statm(struct mm_struct *,
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4648c7f..fd8fd7f 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -802,6 +802,75 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
 	return 0;
 }
 
+static void add_smaps_sum(struct mem_size_stats *mss,
+		struct mem_size_stats *mss_sum)
+{
+	mss_sum->resident += mss->resident;
+	mss_sum->pss += mss->pss;
+	mss_sum->shared_clean += mss->shared_clean;
+	mss_sum->shared_dirty += mss->shared_dirty;
+	mss_sum->private_clean += mss->private_clean;
+	mss_sum->private_dirty += mss->private_dirty;
+	mss_sum->referenced += mss->referenced;
+	mss_sum->anonymous += mss->anonymous;
+	mss_sum->anonymous_thp += mss->anonymous_thp;
+	mss_sum->swap += mss->swap;
+}
+
+static int totmaps_proc_show(struct seq_file *m, void *data)
+{
+	struct proc_maps_private *priv = m->private;
+	struct mm_struct *mm = priv->mm;
+	struct vm_area_struct *vma;
+	struct mem_size_stats mss_sum;
+
+	memset(&mss_sum, 0, sizeof(mss_sum));
+	down_read(&mm->mmap_sem);
+	hold_task_mempolicy(priv);
+
+	for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
+		struct mem_size_stats mss;
+		struct mm_walk smaps_walk = {
+			.pmd_entry = smaps_pte_range,
+			.mm = vma->vm_mm,
+			.private = &mss,
+		};
+
+		if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
+			memset(&mss, 0, sizeof(mss));
+			walk_page_vma(vma, &smaps_walk);
+			add_smaps_sum(&mss, &mss_sum);
+		}
+	}
+
+	release_task_mempolicy(priv);
+	up_read(&mm->mmap_sem);
+
+	seq_printf(m,
+		   "Rss:            %8lu kB\n"
+		   "Pss:            %8lu kB\n"
+		   "Shared_Clean:   %8lu kB\n"
+		   "Shared_Dirty:   %8lu kB\n"
+		   "Private_Clean:  %8lu kB\n"
+		   "Private_Dirty:  %8lu kB\n"
+		   "Referenced:     %8lu kB\n"
+		   "Anonymous:      %8lu kB\n"
+		   "AnonHugePages:  %8lu kB\n"
+		   "Swap:           %8lu kB\n",
+		   mss_sum.resident >> 10,
+		   (unsigned long)(mss_sum.pss >> (10 + PSS_SHIFT)),
+		   mss_sum.shared_clean  >> 10,
+		   mss_sum.shared_dirty  >> 10,
+		   mss_sum.private_clean >> 10,
+		   mss_sum.private_dirty >> 10,
+		   mss_sum.referenced >> 10,
+		   mss_sum.anonymous >> 10,
+		   mss_sum.anonymous_thp >> 10,
+		   mss_sum.swap >> 10);
+
+	return 0;
+}
+
 static int show_pid_smap(struct seq_file *m, void *v)
 {
 	return show_smap(m, v, 1);
@@
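
For comparison, here is a minimal userspace sketch of the summation that
totmaps moves into the kernel: it adds up the Rss and Pss fields of an
smaps-format stream. The names smaps_totals and sum_smaps are hypothetical
and not part of the patch. The sketch sums the per-mapping kB values as
printed, whereas the patch sums the raw counters and shifts once at the end,
so the Pss total may come out slightly lower here through rounding.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the two totals this sketch collects. */
struct smaps_totals {
	unsigned long rss_kb;
	unsigned long pss_kb;
};

/*
 * Sum the "Rss:" and "Pss:" fields of every mapping in an smaps-format
 * stream. Lines such as "Pss_Anon:" or "SwapPss:" do not match, because
 * the literal prefix in the format string has to match from the start of
 * the line. Returns 0 on success, -1 on a read error.
 */
static int sum_smaps(FILE *f, struct smaps_totals *t)
{
	char line[512];
	unsigned long kb;

	memset(t, 0, sizeof(*t));
	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "Rss: %lu kB", &kb) == 1)
			t->rss_kb += kb;
		else if (sscanf(line, "Pss: %lu kB", &kb) == 1)
			t->pss_kb += kb;
	}
	return ferror(f) ? -1 : 0;
}
```

To total a live process, pass a stream opened with
fopen("/proc/self/smaps", "r"); every mapping still has to be formatted and
parsed, which is the cost the patch description refers to.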

Re: [PATCH v1] kernel/hung_task.c: Dump all UNINTERRUPTIBLE tasks

2016-08-29 Thread Robert Foss




On 2016-08-11 12:35 PM, Robert Foss wrote:



On 2016-08-10 06:43 PM, Andrew Morton wrote:

On Tue,  2 Aug 2016 11:23:11 -0400 robert.f...@collabora.com wrote:


From: Aaron Durbin <adur...@chromium.org>

When the panic path is taken for khungtaskd, dump all
tasks with the UNINTERRUPTIBLE state. That way, any
inter-dependent tasks that caused one another to hang
will be saved in the crash output.

...

--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -122,6 +122,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 	touch_nmi_watchdog();
 
 	if (sysctl_hung_task_panic) {
+		/* Dump all tasks. */
+		show_state_filter(TASK_UNINTERRUPTIBLE);
 		trigger_all_cpu_backtrace();
 		panic("hung_task: blocked tasks");
 	}


Well, it's going to produce more gunk for the operator to read through
and understand.

I'd like to hear a little more about the value of this change: what
particular problem prompted it, etc.



It would indeed provide more gunk. What makes it useful is that it is
enabled by default and enables rapid debugging of devices that are not
physically accessible or otherwise accessible for debugging.

So the primary use case would be when a user of a device is seeing some
issues and submits the logs from the device.
Without any further action from the user, the problem could potentially
be solved.


The debug output could be formatted better; would that make this patch
more appealing?

Re: [PATCH v1] mm, sysctl: Add sysctl for controlling VM_MAYEXEC taint

2016-08-29 Thread Robert Foss

On 2016-08-29 11:25 AM, Will Drewry wrote:

On Fri, Aug 26, 2016 at 4:32 PM, Kirill A. Shutemov
<kir...@shutemov.name> wrote:

On Fri, Aug 26, 2016 at 12:30:04PM -0400, robert.f...@collabora.com wrote:
> From: Will Drewry <w...@chromium.org>
>
> This patch proposes a sysctl knob that allows a privileged user to
> disable ~VM_MAYEXEC tainting when mapping in a vma from a MNT_NOEXEC
> mountpoint.  It does not alter the normal behavior resulting from
> attempting to directly mmap(PROT_EXEC) a vma (-EPERM) nor the behavior
> of any other subsystems checking MNT_NOEXEC.

Wouldn't it be equal to remounting all filesystems without noexec from
attacker POV? It's hardly a fence to make additional mprotect(PROT_EXEC)
call, before starting executing code from such filesystems.

If administrator of the system wants this, he can just mount filesystem
without noexec, no new kernel code required. And it's more fine-grained
than this.

So, no, I don't think we should add knob like this. Unless I miss
something.

I don't believe this patch is necessary anymore (though, thank you
Robert for testing and re-sending!).

The primary offenders with respect to needing to mmap/mprotect a file in
/dev/shm were the older nvidia driver (binary only, iirc) and the Chrome
Native Client code.

The reason why half-exec is an "ok" (half) mitigation is because it
blocks simple gadgets and other paths for using loadable libraries or
binaries (via glibc) as it disallows mmap(PROT_EXEC) even though it
allows mprotect(PROT_EXEC).  This stops ld in its tracks since it does
the obvious thing and uses mmap(PROT_EXEC).

I think time has marched on and this patch is now something I can toss
in the dustbin of history. Both Chrome's Native Client and an older
nvidia driver relied on creating-then-unlinking a file in tmpfs, but
there is now a better facility!

NAK.

Agreed - this is old and software that predicated it should be gone.. I
hope. :)

Splendid, patch dropped!
Thanks Will and Kirill!

Rob.

> It is motivated by a common /dev/shm, /tmp usecase. There are few
> facilities for creating a shared memory segment that can be remapped in
> the same process address space with different permissions.

What about using memfd_create(2) for such cases? You'll get a file
descriptor from in-kernel tmpfs (shm_mnt) which is not exposed to
userspace for remount as noexec.

This is a relatively old patch ( https://lwn.net/Articles/455256/ ) which
predated memfd_create(). memfd_create() is the right solution to this problem!

Thanks again!
will
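
A minimal sketch of the memfd_create(2) approach suggested above, assuming
Linux with a glibc that provides the memfd_create() wrapper (2.27 or later).
It maps one memfd-backed segment twice in the same address space, each view
with its own protection. The names dual_map, dual_map_create and
dual_map_destroy are hypothetical and only serve the illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: two views of one memfd-backed segment. */
struct dual_map {
	void *rw;	/* read/write view */
	void *alias;	/* second view with caller-chosen protection */
	size_t len;
};

/* Returns 0 on success, -1 on failure (errno is set by the failing call). */
static int dual_map_create(struct dual_map *dm, size_t len, int alias_prot)
{
	/* The file lives on the kernel-internal tmpfs, not on /dev/shm. */
	int fd = memfd_create("dual-map", MFD_CLOEXEC);

	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) < 0)
		goto err_close;

	dm->rw = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (dm->rw == MAP_FAILED)
		goto err_close;

	dm->alias = mmap(NULL, len, alias_prot, MAP_SHARED, fd, 0);
	if (dm->alias == MAP_FAILED) {
		munmap(dm->rw, len);
		goto err_close;
	}

	dm->len = len;
	close(fd);	/* the two mappings keep the segment alive */
	return 0;

err_close:
	close(fd);
	return -1;
}

static void dual_map_destroy(struct dual_map *dm)
{
	munmap(dm->rw, dm->len);
	munmap(dm->alias, dm->len);
}
```

Passing PROT_READ | PROT_EXEC as alias_prot gives the write-here,
execute-there layout discussed in this thread, provided local policy permits
executable mappings of the memfd.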

[PATCH v3 2/5] net: asix: Avoid looping when the device is disconnected

2016-08-29 Thread robert . foss

From: Vincent Palatin <vpala...@chromium.org>

Check the answers from the USB stack and avoid re-sending the request
multiple times if the device has disappeared.

Signed-off-by: Vincent Palatin <vpala...@chromium.org>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix_common.c  | 56 +-
 drivers/net/usb/asix_devices.c |  2 ++
 2 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c
index 25609ee..f79eb12 100644
--- a/drivers/net/usb/asix_common.c
+++ b/drivers/net/usb/asix_common.c
@@ -428,13 +428,21 @@ int asix_mdio_read(struct net_device *netdev, int phy_id, int loc)
 	__le16 res;
 	u8 smsr;
 	int i = 0;
+	int ret;
 
 	mutex_lock(&dev->phy_mutex);
 	do {
-		asix_set_sw_mii(dev, 0);
+		ret = asix_set_sw_mii(dev, 0);
+		if (ret == -ENODEV)
+			break;
 		usleep_range(1000, 1100);
-		asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 0);
-	} while (!(smsr & AX_HOST_EN) && (i++ < 30));
+		ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+				    0, 0, 1, &smsr, 0);
+	} while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+	if (ret == -ENODEV) {
+		mutex_unlock(&dev->phy_mutex);
+		return ret;
+	}
 
 	asix_read_cmd(dev, AX_CMD_READ_MII_REG, phy_id,
 				(__u16)loc, 2, &res, 0);
@@ -453,16 +461,24 @@ void asix_mdio_write(struct net_device *netdev, int phy_id, int loc, int val)
 	__le16 res = cpu_to_le16(val);
 	u8 smsr;
 	int i = 0;
+	int ret;
 
 	netdev_dbg(dev->net, "asix_mdio_write() phy_id=0x%02x, loc=0x%02x, val=0x%04x\n",
 			phy_id, loc, val);
 
 	mutex_lock(&dev->phy_mutex);
 	do {
-		asix_set_sw_mii(dev, 0);
+		ret = asix_set_sw_mii(dev, 0);
+		if (ret == -ENODEV)
+			break;
 		usleep_range(1000, 1100);
-		asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 0);
-	} while (!(smsr & AX_HOST_EN) && (i++ < 30));
+		ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+				    0, 0, 1, &smsr, 0);
+	} while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+	if (ret == -ENODEV) {
+		mutex_unlock(&dev->phy_mutex);
+		return;
+	}
 
 	asix_write_cmd(dev, AX_CMD_WRITE_MII_REG, phy_id,
 		       (__u16)loc, 2, &res, 0);
@@ -476,13 +492,21 @@ int asix_mdio_read_nopm(struct net_device *netdev, int phy_id, int loc)
 	__le16 res;
 	u8 smsr;
 	int i = 0;
+	int ret;
 
 	mutex_lock(&dev->phy_mutex);
 	do {
-		asix_set_sw_mii(dev, 1);
+		ret = asix_set_sw_mii(dev, 1);
+		if (ret == -ENODEV)
+			break;
 		usleep_range(1000, 1100);
-		asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 1);
-	} while (!(smsr & AX_HOST_EN) && (i++ < 30));
+		ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+				    0, 0, 1, &smsr, 1);
+	} while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+	if (ret == -ENODEV) {
+		mutex_unlock(&dev->phy_mutex);
+		return ret;
+	}
 
 	asix_read_cmd(dev, AX_CMD_READ_MII_REG, phy_id,
 		      (__u16)loc, 2, &res, 1);
@@ -502,16 +526,24 @@ asix_mdio_write_nopm(struct net_device *netdev, int phy_id, int loc, int val)
 	__le16 res = cpu_to_le16(val);
 	u8 smsr;
 	int i = 0;
+	int ret;
 
 	netdev_dbg(dev->net, "asix_mdio_write() phy_id=0x%02x, loc=0x%02x, val=0x%04x\n",
 			phy_id, loc, val);
 
 	mutex_lock(&dev->phy_mutex);
 	do {
-		asix_set_sw_mii(dev, 1);
+		ret = asix_set_sw_mii(dev, 1);
+		if (ret == -ENODEV)
+			break;
 		usleep_range(1000, 1100);
-		asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 1);
-	} while (!(smsr & AX_HOST_EN) && (i++ < 30));
+		ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+				    0, 0, 1, &smsr, 1);
+	} while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+	if (ret == -ENODEV) {
+		mutex_unlock(&dev->phy_mutex);
+		return;
+	}
 
 	asix_write_cmd(dev, AX_CMD_WRITE_MII_REG, phy_id,
 		       (__u16)loc, 2, &res, 1);
diff --git a
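
The loop change in this patch follows a general pattern: keep polling for a
status bit, but give up at once when the device reports -ENODEV. Below is a
userspace sketch of that pattern, not driver code. The names
wait_host_enable, read_status_fn, fake_dev and fake_read_status, the value
of HOST_EN, the retry count and the -ETIMEDOUT return are all illustrative
assumptions; the driver itself simply carries on once its retries run out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define HOST_EN		0x01	/* stand-in for AX_HOST_EN, value illustrative */
#define MAX_POLLS	30	/* assumed bound on the number of status reads */

/* Hypothetical status-read callback: returns 0 or a negative errno. */
typedef int (*read_status_fn)(void *ctx, uint8_t *status);

/*
 * Poll until HOST_EN is set. Returns 0 once the bit is seen, -ENODEV as
 * soon as the device reports that it is gone, and -ETIMEDOUT when the
 * polls run out.
 */
static int wait_host_enable(read_status_fn read_status, void *ctx)
{
	uint8_t status = 0;
	int i, ret;

	for (i = 0; i < MAX_POLLS; i++) {
		ret = read_status(ctx, &status);
		if (ret == -ENODEV)
			return ret;	/* unplugged: retrying cannot help */
		if (ret == 0 && (status & HOST_EN))
			return 0;
	}
	return -ETIMEDOUT;
}

/* Fake device for demonstration purposes only. */
struct fake_dev {
	int reads;	/* status reads performed so far */
	int ready_at;	/* read count at which HOST_EN appears (0 = never) */
	int unplug_at;	/* read count at which reads fail (0 = never) */
};

static int fake_read_status(void *ctx, uint8_t *status)
{
	struct fake_dev *d = ctx;

	d->reads++;
	if (d->unplug_at && d->reads >= d->unplug_at)
		return -ENODEV;
	*status = (d->ready_at && d->reads >= d->ready_at) ? HOST_EN : 0;
	return 0;
}
```

With a device that disappears on its second read, the loop stops after two
reads instead of running through every retry, which is the behaviour the
patch is after.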

[PATCH v3 5/5] net: asix: autoneg will set WRITE_MEDIUM reg

2016-08-29 Thread robert . foss

From: Grant Grundler <grund...@chromium.org>

The mii_nway_restart() causes a PHY link change activity and
ax88772_link_reset will be called. link_reset will set
AX_CMD_WRITE_MEDIUM_MODE register correctly.

The asix_write_medium_mode in reset() fills in a default value to the register
which may be different from the negotiation result. So do this first.

Ignore the ret value since it's ignored in XXX_link_reset() functions.

Signed-off-by: Grant Grundler <grund...@google.com>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix_devices.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index dbcdda2..cce2495 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -928,12 +928,9 @@ static int ax88178_reset(struct usbnet *dev)
 	asix_mdio_write(dev->net, dev->mii.phy_id, MII_CTRL1000,
 			ADVERTISE_1000FULL);
 
+	asix_write_medium_mode(dev, AX88178_MEDIUM_DEFAULT, 0);
 	mii_nway_restart(&dev->mii);
 
-	ret = asix_write_medium_mode(dev, AX88178_MEDIUM_DEFAULT, 0);
-	if (ret < 0)
-		return ret;
-
 	/* Rewrite MAC address */
 	memcpy(data->mac_addr, dev->net->dev_addr, ETH_ALEN);
 	ret = asix_write_cmd(dev, AX_CMD_WRITE_NODE_ID, 0, 0, ETH_ALEN,
-- 
2.7.4

[PATCH v3 3/5] net: asix: Fix AX88772x resume failures

2016-08-29 Thread robert . foss

From: Allan Chou <al...@asix.com.tw>

This change fixes AX88772x resume failures by:
- Restoring incorrect AX88772A PHY registers when resetting
- Stopping MAC operation when suspending
- Restarting MII when restoring the PHY

Signed-off-by: Allan Chou <al...@asix.com.tw>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix_devices.c | 47 +-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index ebeb730..083dc2e 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -35,6 +35,15 @@
 
 #define PHY_MODE_RTL8211CL      0x000C
 
+#define AX88772A_PHY14H         0x14
+#define AX88772A_PHY14H_DEFAULT 0x442C
+
+#define AX88772A_PHY15H         0x15
+#define AX88772A_PHY15H_DEFAULT 0x03C8
+
+#define AX88772A_PHY16H         0x16
+#define AX88772A_PHY16H_DEFAULT 0x4044
+
 struct ax88172_int_data {
__le16 res1;
u8 link;
@@ -424,7 +433,7 @@ static int ax88772a_hw_reset(struct usbnet *dev, int in_pm)
 {
struct asix_data *data = (struct asix_data *)&dev->data;
int ret, embd_phy;
-   u16 rx_ctl;
+   u16 rx_ctl, phy14h, phy15h, phy16h;
u8 chipcode = 0;
 
ret = asix_write_gpio(dev, AX_GPIO_RSE, 5, in_pm);
@@ -482,6 +491,32 @@ static int ax88772a_hw_reset(struct usbnet *dev, int in_pm)
   ret);
goto out;
}
+   } else if ((chipcode & AX_CHIPCODE_MASK) == AX_AX88772A_CHIPCODE) {
+   /* Check if the PHY registers have default settings */
+   phy14h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY14H);
+   phy15h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY15H);
+   phy16h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY16H);
+
+   netdev_dbg(dev->net,
+  "772a_hw_reset: MR20=0x%x MR21=0x%x MR22=0x%x\n",
+  phy14h, phy15h, phy16h);
+
+   /* Restore PHY registers default setting if not */
+   if (phy14h != AX88772A_PHY14H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY14H,
+AX88772A_PHY14H_DEFAULT);
+   if (phy15h != AX88772A_PHY15H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY15H,
+AX88772A_PHY15H_DEFAULT);
+   if (phy16h != AX88772A_PHY16H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY16H,
+AX88772A_PHY16H_DEFAULT);
}
 
ret = asix_write_cmd(dev, AX_CMD_WRITE_IPG0,
@@ -543,6 +578,15 @@ static const struct net_device_ops ax88772_netdev_ops = {
 static void ax88772_suspend(struct usbnet *dev)
 {
struct asix_common_private *priv = dev->driver_priv;
+   u16 medium;
+
+   /* Stop MAC operation */
+   medium = asix_read_medium_status(dev, 0);
+   medium &= ~AX_MEDIUM_RE;
+   asix_write_medium_mode(dev, medium, 0);
+
+   netdev_dbg(dev->net, "ax88772_suspend: medium=0x%04x\n",
+  asix_read_medium_status(dev, 0));
 
/* Preserve BMCR for restoring */
priv->presvd_phy_bmcr =
@@ -577,6 +621,7 @@ static void ax88772_restore_phy(struct usbnet *dev)
asix_mdio_write_nopm(dev->net, dev->mii.phy_id, MII_BMCR,
 priv->presvd_phy_bmcr);
 
+   mii_nway_restart(&dev->mii);
priv->presvd_phy_advertise = 0;
priv->presvd_phy_bmcr = 0;
}
-- 
2.7.4
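
The AX88772A hunk above reads three PHY registers and rewrites only those that have drifted from their defaults. Registers 0x14, 0x15 and 0x16 are MII registers 20 to 22, which is why the debug message labels them MR20 to MR22. The same check-and-restore logic can be sketched in table-driven form. This is a userspace illustration only: the regs[] array is a hypothetical stand-in for asix_mdio_read_nopm()/asix_mdio_write_nopm(), and restore_phy_defaults() is not a function in the driver. The register numbers and default values are the ones from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Register/default pairs taken from the #defines added by the patch. */
struct phy_default {
	uint8_t reg;
	uint16_t value;
};

static const struct phy_default ax88772a_defaults[] = {
	{ 0x14, 0x442C },	/* AX88772A_PHY14H_DEFAULT */
	{ 0x15, 0x03C8 },	/* AX88772A_PHY15H_DEFAULT */
	{ 0x16, 0x4044 },	/* AX88772A_PHY16H_DEFAULT */
};

/*
 * Hypothetical helper: regs[] models the 32 MII registers of one PHY.
 * Rewrites every listed register that differs from its default and
 * returns how many registers had to be rewritten.
 */
static unsigned int restore_phy_defaults(uint16_t regs[32])
{
	unsigned int rewritten = 0;
	size_t i;

	for (i = 0; i < sizeof(ax88772a_defaults) / sizeof(ax88772a_defaults[0]); i++) {
		const struct phy_default *d = &ax88772a_defaults[i];

		if (regs[d->reg] != d->value) {
			regs[d->reg] = d->value;
			rewritten++;
		}
	}
	return rewritten;
}
```

Calling restore_phy_defaults() a second time on the same array returns 0, which mirrors the patch's behaviour of skipping the MDIO write when a register already holds its default.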

[PATCH v3 4/5] net: asix: see 802.3 spec for phy reset

2016-08-29 Thread robert . foss

From: Grant Grundler <grund...@chromium.org>

https://lkml.org/lkml/2014/11/11/947

Ben Hutchings is correct. IEEE 802.3 spec section "22.2.4.1.1 Reset" requires
up to 500ms delay. Mitigate the "max" delay by polling the PHY until the
BMCR_RESET bit is clear.

Signed-off-by: Grant Grundler <grund...@chromium.org>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix_devices.c | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index 083dc2e..dbcdda2 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -212,6 +212,28 @@ static const struct net_device_ops ax88172_netdev_ops = {
.ndo_set_rx_mode = ax88172_set_multicast,
 };
 
+static void asix_phy_reset(struct usbnet *dev, unsigned int reset_bits)
+{
+	unsigned int timeout = 5000;
+
+	asix_mdio_write(dev->net, dev->mii.phy_id, MII_BMCR, reset_bits);
+
+	/* give phy_id a chance to process reset */
+	udelay(500);
+
+	/* See IEEE 802.3 "22.2.4.1.1 Reset": 500ms max */
+	while (timeout--) {
+		if (asix_mdio_read(dev->net, dev->mii.phy_id, MII_BMCR)
+							& BMCR_RESET)
+			udelay(100);
+		else
+			return;
+	}
+
+	netdev_err(dev->net, "BMCR_RESET timeout on phy_id %d\n",
+		   dev->mii.phy_id);
+}
+
 static int ax88172_bind(struct usbnet *dev, struct usb_interface *intf)
 {
int ret = 0;
@@ -258,7 +280,7 @@ static int ax88172_bind(struct usbnet *dev, struct usb_interface *intf)
dev->net->needed_headroom = 4; /* cf asix_tx_fixup() */
dev->net->needed_tailroom = 4; /* cf asix_tx_fixup() */
 
-   asix_mdio_write(dev->net, dev->mii.phy_id, MII_BMCR, BMCR_RESET);
+   asix_phy_reset(dev, BMCR_RESET);
asix_mdio_write(dev->net, dev->mii.phy_id, MII_ADVERTISE,
ADVERTISE_ALL | ADVERTISE_CSMA | ADVERTISE_PAUSE_CAP);
mii_nway_restart(&dev->mii);
@@ -900,8 +922,7 @@ static int ax88178_reset(struct usbnet *dev)
} else if (data->phymode == PHY_MODE_RTL8211CL)
rtl8211cl_phy_init(dev);
 
-   asix_mdio_write(dev->net, dev->mii.phy_id, MII_BMCR,
-   BMCR_RESET | BMCR_ANENABLE);
+   asix_phy_reset(dev, BMCR_RESET | BMCR_ANENABLE);
asix_mdio_write(dev->net, dev->mii.phy_id, MII_ADVERTISE,
ADVERTISE_ALL | ADVERTISE_CSMA | ADVERTISE_PAUSE_CAP);
asix_mdio_write(dev->net, dev->mii.phy_id, MII_CTRL1000,
-- 
2.7.4
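
The loop in asix_phy_reset() above bounds the wait at the 500 ms that the commit message cites: 5000 polls with udelay(100) between them is 5000 x 100 us = 500 ms, on top of the initial 500 us settle delay. The bounded-poll pattern can be sketched in userspace as follows. Everything named fake_* is a hypothetical stand-in (the real code reads BMCR over MDIO and calls udelay()), and poll_phy_reset() is not a driver function. BMCR_RESET is bit 15 (0x8000) of the MII control register.

```c
#include <stdbool.h>
#include <stdint.h>

#define BMCR_RESET 0x8000u	/* MII BMCR bit 15: reset in progress */

/* Hypothetical PHY model: the reset bit self-clears after a number of reads. */
struct fake_phy {
	unsigned int polls_until_ready;
	uint16_t bmcr;
};

static uint16_t fake_mdio_read_bmcr(struct fake_phy *phy)
{
	if (phy->polls_until_ready == 0)
		phy->bmcr &= (uint16_t)~BMCR_RESET;
	else
		phy->polls_until_ready--;
	return phy->bmcr;
}

/*
 * Request a reset, then poll until BMCR_RESET clears or max_polls is
 * exhausted. *waited_us accumulates the delay the driver would have
 * spent in udelay(step_us). Returns true if the reset completed.
 */
static bool poll_phy_reset(struct fake_phy *phy, unsigned int max_polls,
			   unsigned int step_us, unsigned long *waited_us)
{
	*waited_us = 0;
	phy->bmcr |= BMCR_RESET;

	while (max_polls--) {
		if (!(fake_mdio_read_bmcr(phy) & BMCR_RESET))
			return true;
		*waited_us += step_us;
	}
	return false;
}
```

With the values from the patch (max_polls = 5000, step_us = 100), a PHY that never clears the bit makes the sketch give up after 500000 us of accumulated delay, while a PHY that finishes quickly returns after only a few polls. That is the point of polling rather than sleeping for the full worst case.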

[PATCH v3 1/5] net: asix: Add in_pm parameter

2016-08-29 Thread robert . foss

From: Freddy Xin <fre...@asix.com.tw>

In order to R/W registers in suspend/resume functions, in_pm flags are
added to some functions to determine whether the nopm version of usb
functions is called.

Save BMCR and ANAR PHY registers in suspend function and restore them
in resume function.

Reset HW in resume function to ensure the PHY works correctly.

Signed-off-by: Freddy Xin <fre...@asix.com.tw>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix.h |  40 +++--
 drivers/net/usb/asix_common.c  | 180 +++-
 drivers/net/usb/asix_devices.c | 373 -
 drivers/net/usb/ax88172a.c |  29 ++--
 4 files changed, 472 insertions(+), 150 deletions(-)

diff --git a/drivers/net/usb/asix.h b/drivers/net/usb/asix.h
index a2d3ea6..d109242 100644
--- a/drivers/net/usb/asix.h
+++ b/drivers/net/usb/asix.h
@@ -46,6 +46,7 @@
 #define AX_CMD_SET_SW_MII       0x06
 #define AX_CMD_READ_MII_REG     0x07
 #define AX_CMD_WRITE_MII_REG    0x08
+#define AX_CMD_STATMNGSTS_REG   0x09
 #define AX_CMD_SET_HW_MII       0x0a
 #define AX_CMD_READ_EEPROM      0x0b
 #define AX_CMD_WRITE_EEPROM     0x0c
@@ -71,6 +72,17 @@
 #define AX_CMD_SW_RESET         0x20
 #define AX_CMD_SW_PHY_STATUS    0x21
 #define AX_CMD_SW_PHY_SELECT    0x22
+#define AX_QCTCTRL              0x2A
+
+#define AX_CHIPCODE_MASK        0x70
+#define AX_AX88772_CHIPCODE     0x00
+#define AX_AX88772A_CHIPCODE    0x10
+#define AX_AX88772B_CHIPCODE    0x20
+#define AX_HOST_EN              0x01
+
+#define AX_PHYSEL_PSEL          0x01
+#define AX_PHYSEL_SSMII         0
+#define AX_PHYSEL_SSEN          0x10
 
 #define AX_PHY_SELECT_MASK (BIT(3) | BIT(2))
 #define AX_PHY_SELECT_INTERNAL 0
@@ -173,6 +185,10 @@ struct asix_rx_fixup_info {
 };
 
 struct asix_common_private {
+   void (*resume)(struct usbnet *dev);
+   void (*suspend)(struct usbnet *dev);
+   u16 presvd_phy_advertise;
+   u16 presvd_phy_bmcr;
struct asix_rx_fixup_info rx_fixup_info;
 };
 
@@ -182,10 +198,10 @@ extern const struct driver_info ax88172a_info;
 #define FLAG_EEPROM_MAC (1UL << 0)  /* init device MAC from eeprom */
 
 int asix_read_cmd(struct usbnet *dev, u8 cmd, u16 value, u16 index,
- u16 size, void *data);
+ u16 size, void *data, int in_pm);
 
 int asix_write_cmd(struct usbnet *dev, u8 cmd, u16 value, u16 index,
-  u16 size, void *data);
+  u16 size, void *data, int in_pm);
 
 void asix_write_cmd_async(struct usbnet *dev, u8 cmd, u16 value,
  u16 index, u16 size, void *data);
@@ -197,27 +213,31 @@ int asix_rx_fixup_common(struct usbnet *dev, struct sk_buff *skb);
 struct sk_buff *asix_tx_fixup(struct usbnet *dev, struct sk_buff *skb,
  gfp_t flags);
 
-int asix_set_sw_mii(struct usbnet *dev);
-int asix_set_hw_mii(struct usbnet *dev);
+int asix_set_sw_mii(struct usbnet *dev, int in_pm);
+int asix_set_hw_mii(struct usbnet *dev, int in_pm);
 
 int asix_read_phy_addr(struct usbnet *dev, int internal);
 int asix_get_phy_addr(struct usbnet *dev);
 
-int asix_sw_reset(struct usbnet *dev, u8 flags);
+int asix_sw_reset(struct usbnet *dev, u8 flags, int in_pm);
 
-u16 asix_read_rx_ctl(struct usbnet *dev);
-int asix_write_rx_ctl(struct usbnet *dev, u16 mode);
+u16 asix_read_rx_ctl(struct usbnet *dev, int in_pm);
+int asix_write_rx_ctl(struct usbnet *dev, u16 mode, int in_pm);
 
-u16 asix_read_medium_status(struct usbnet *dev);
-int asix_write_medium_mode(struct usbnet *dev, u16 mode);
+u16 asix_read_medium_status(struct usbnet *dev, int in_pm);
+int asix_write_medium_mode(struct usbnet *dev, u16 mode, int in_pm);
 
-int asix_write_gpio(struct usbnet *dev, u16 value, int sleep);
+int asix_write_gpio(struct usbnet *dev, u16 value, int sleep, int in_pm);
 
 void asix_set_multicast(struct net_device *net);
 
 int asix_mdio_read(struct net_device *netdev, int phy_id, int loc);
 void asix_mdio_write(struct net_device *netdev, int phy_id, int loc, int val);
 
+int asix_mdio_read_nopm(struct net_device *netdev, int phy_id, int loc);
+void asix_mdio_write_nopm(struct net_device *netdev, int phy_id, int loc,
+ int val);
+
 void asix_get_wol(struct net_device *net, struct ethtool_wolinfo *wolinfo);
 int asix_set_wol(struct net_device *net, struct ethtool_wolinfo *wolinfo);
 
diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c
index 7de5ab5..25609ee 100644
--- a/drivers/net/usb/asix_common.c
+++ b/drivers/net/usb/asix_common.c
@@ -22,24 +22,49 @@
 #include "asix.h"
 
 int asix_read_cmd(struct usbnet *dev, u8 cmd, u16 value, u16 index,
- u16 size, void *data)
+ u16 size, void *data, int in_pm)
 {
int ret;
-   ret = usbnet_r
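
The commit message says the in_pm flag decides whether the nopm variant of the USB transfer function is called. The body of asix_read_cmd() is cut off in this archive, so the following is only a sketch of that dispatch idea, not the driver's code. fake_dev, fake_read_cmd(), fake_read_cmd_nopm() and sketch_read_cmd() are all hypothetical names standing in for the usbnet device and its read helpers.

```c
/* Which transfer path the last command used (for illustration only). */
enum xfer_path { XFER_REGULAR, XFER_NOPM };

/* Hypothetical device model: a small register file plus a path marker. */
struct fake_dev {
	enum xfer_path last_path;
	unsigned char reg[256];
};

/* Stand-in for the normal helper, which may take a runtime-PM reference. */
static int fake_read_cmd(struct fake_dev *dev, unsigned char cmd,
			 unsigned char *out)
{
	dev->last_path = XFER_REGULAR;
	*out = dev->reg[cmd];
	return 1;	/* bytes read */
}

/* Stand-in for the nopm helper, safe to call from suspend/resume. */
static int fake_read_cmd_nopm(struct fake_dev *dev, unsigned char cmd,
			      unsigned char *out)
{
	dev->last_path = XFER_NOPM;
	*out = dev->reg[cmd];
	return 1;	/* bytes read */
}

/* Select the transfer function once, based on in_pm, then call it. */
static int sketch_read_cmd(struct fake_dev *dev, unsigned char cmd,
			   unsigned char *out, int in_pm)
{
	int (*fn)(struct fake_dev *, unsigned char, unsigned char *);

	fn = in_pm ? fake_read_cmd_nopm : fake_read_cmd;
	return fn(dev, cmd, out);
}
```

Threading in_pm through every wrapper, as the prototypes in asix.h now do, keeps a single code path per operation and leaves the choice of PM behaviour to the caller: suspend and resume handlers pass 1, everything else passes 0.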

[PATCH v1] mm, sysctl: Add sysctl for controlling VM_MAYEXEC taint

2016-08-26 Thread robert . foss

From: Will Drewry <w...@chromium.org>

This patch proposes a sysctl knob that allows a privileged user to
disable ~VM_MAYEXEC tainting when mapping in a vma from a MNT_NOEXEC
mountpoint.  It does not alter the normal behavior resulting from
attempting to directly mmap(PROT_EXEC) a vma (-EPERM) nor the behavior
of any other subsystems checking MNT_NOEXEC.

It is motivated by a common /dev/shm, /tmp use case. There are few
facilities for creating a shared memory segment that can be remapped in
the same process address space with different permissions.  Often, a
file in /tmp provides this functionality.  However, on distributions
that are more restrictive/paranoid, world-writeable directories are
often mounted "noexec".  The only workaround to support software that
needs this behavior is to either not use that software or remount /tmp
exec.  (E.g., https://bugs.gentoo.org/350336?id=350336)  Given that
the only recourse is using SysV IPC, the application programmer loses
many of the useful ABI features that they get using a mmap'd file.

With this patch, it would be possible to change the sysctl variable
such that mprotect(PROT_EXEC) would succeed.  In cases like the example
above, an additional userspace mmap-wrapper would be needed, but in
other cases, like how code.google.com/p/nativeclient mmap()s then
mprotect()s, the behavior would be unaffected.

The tradeoff is a loss of defense in depth, but it seems reasonable when
the alternative is frequently to disable the defense entirely.

(There are many other ways to approach this problem, but this seemed to
 be the most practical and feel the least like a hack or a major change.)

Signed-off-by: Will Drewry <w...@chromium.org>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 include/linux/mm.h |  2 ++
 kernel/sysctl.c    |  9 +++++++++
 mm/Kconfig         | 17 +++++++++++++++++
 mm/mmap.c          |  3 ++-
 mm/util.c          |  1 +
 5 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 08ed53e..e2090c5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -108,6 +108,8 @@ extern int mmap_rnd_compat_bits __read_mostly;
 
 extern int sysctl_max_map_count;
 
+extern int sysctl_mmap_noexec_taint;
+
 extern unsigned long sysctl_user_reserve_kbytes;
 extern unsigned long sysctl_admin_reserve_kbytes;
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index b43d0b2..ab1d714 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1564,6 +1564,15 @@ static struct ctl_table vm_table[] = {
.mode   = 0644,
.proc_handler   = mmap_min_addr_handler,
},
+   {
+   .procname   = "mmap_noexec_taint",
+   .data   = &sysctl_mmap_noexec_taint,
+   .maxlen = sizeof(sysctl_mmap_noexec_taint),
+   .mode   = 0644,
+   .proc_handler   = proc_dointvec_minmax,
+   .extra1 = &zero,
+   .extra2 = &one,
+   },
 #endif
 #ifdef CONFIG_NUMA
{
diff --git a/mm/Kconfig b/mm/Kconfig
index 78a23c5..08d9bc8 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -353,6 +353,23 @@ config DEFAULT_MMAP_MIN_ADDR
  This value can be changed after boot using the
  /proc/sys/vm/mmap_min_addr tunable.
 
+config MMAP_NOEXEC_TAINT
+   int "Turns on tainting of mmap()d files from noexec mountpoints"
+   depends on MMU
+   default 1
+   help
+ By default, the ability to change the protections of a virtual
+ memory area to allow execution depend on if the vma has the
+ VM_MAYEXEC flag.  When mapping regions from files, VM_MAYEXEC
+ will be unset if the containing mountpoint is mounted MNT_NOEXEC.
+ By setting the value to 0, any mmap()d region may be later
+ mprotect()d with PROT_EXEC.
+
+ If unsure, keep the value set to 1.
+
+ This value can be changed after boot using the
+ /proc/sys/vm/mmap_noexec_taint tunable.
+
 config ARCH_SUPPORTS_MEMORY_FAILURE
bool
 
diff --git a/mm/mmap.c b/mm/mmap.c
index ca9d91b..b8be093 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1246,7 +1246,8 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
if (path_noexec(&file->f_path)) {
if (vm_flags & VM_EXEC)
return -EPERM;
-   vm_flags &= ~VM_MAYEXEC;
+   if (sysctl_mmap_noexec_taint)
+   vm_flags &= ~VM_MAYEXEC;
}
 
if (!file->f_op->mmap)
diff --git a/mm/util.c b/mm/util.c
index 662cddf..701f0a3 100644
--- a/mm/util.c
+++ b/mm/util.c
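@@ -430,6 +430,7 @@ int sysctl_overcommit_memory __read_mostly = OVERCOMMIT_GUESS;
 int sysctl_overcommit_ratio __read_mostly = 50;
 unsigned long sysctl_overcommit_kbytes __read_mostly;
 int sysctl_max_ma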
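
The effect of the do_mmap() hunk above can be modelled in a few lines of userspace C. This is a simplified model, not kernel code: model_mmap_flags() mirrors only the noexec branch shown in the patch, and model_mprotect_exec() reduces the mprotect() permission check to "PROT_EXEC needs VM_MAYEXEC". Both function names are hypothetical. The VM_* values are assumed to match include/linux/mm.h.

```c
#include <errno.h>

/* Assumed to match the kernel's vm_flags bit values. */
#define VM_READ     0x00000001UL
#define VM_WRITE    0x00000002UL
#define VM_EXEC     0x00000004UL
#define VM_MAYREAD  0x00000010UL
#define VM_MAYWRITE 0x00000020UL
#define VM_MAYEXEC  0x00000040UL

/*
 * Model of the patched branch in do_mmap(): a direct PROT_EXEC mapping
 * from a noexec mount is always refused, while VM_MAYEXEC is stripped
 * only when the sysctl (noexec_taint) is non-zero.
 */
static int model_mmap_flags(unsigned long *vm_flags, int mount_noexec,
			    int noexec_taint)
{
	if (mount_noexec) {
		if (*vm_flags & VM_EXEC)
			return -EPERM;
		if (noexec_taint)
			*vm_flags &= ~VM_MAYEXEC;
	}
	return 0;
}

/* Simplified mprotect(PROT_EXEC) check: allowed only with VM_MAYEXEC. */
static int model_mprotect_exec(unsigned long vm_flags)
{
	return (vm_flags & VM_MAYEXEC) ? 0 : -EACCES;
}
```

With noexec_taint set to 1 (the default), a read/write mapping from a noexec mount loses VM_MAYEXEC and a later mprotect(PROT_EXEC) fails with -EACCES in this model. With noexec_taint set to 0 the flag survives and the mprotect() succeeds. In both cases a direct mmap(PROT_EXEC) still fails with -EPERM, as the commit message states.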

[PATCH v3 4/5] net: asix: see 802.3 spec for phy reset

2016-08-25 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

From: Grant Grundler <grund...@chromium.org>

https://lkml.org/lkml/2014/11/11/947

Ben Hutchings is correct. IEEE 802.3 spec section "22.2.4.1.1 Reset" requires
up to 500ms delay. Mitigate the "max" delay by polling the PHY until the
BMCR_RESET bit is clear.

Signed-off-by: Grant Grundler <grund...@chromium.org>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix_devices.c | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index 083dc2e..dbcdda2 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -212,6 +212,28 @@ static const struct net_device_ops ax88172_netdev_ops = {
.ndo_set_rx_mode = ax88172_set_multicast,
 };
 
+static void asix_phy_reset(struct usbnet *dev, unsigned int reset_bits)
+{
+	unsigned int timeout = 5000;
+
+	asix_mdio_write(dev->net, dev->mii.phy_id, MII_BMCR, reset_bits);
+
+	/* give phy_id a chance to process reset */
+	udelay(500);
+
+	/* See IEEE 802.3 "22.2.4.1.1 Reset": 500ms max */
+	while (timeout--) {
+		if (asix_mdio_read(dev->net, dev->mii.phy_id, MII_BMCR)
+							& BMCR_RESET)
+			udelay(100);
+		else
+			return;
+	}
+
+	netdev_err(dev->net, "BMCR_RESET timeout on phy_id %d\n",
+		   dev->mii.phy_id);
+}
+
 static int ax88172_bind(struct usbnet *dev, struct usb_interface *intf)
 {
int ret = 0;
@@ -258,7 +280,7 @@ static int ax88172_bind(struct usbnet *dev, struct usb_interface *intf)
dev->net->needed_headroom = 4; /* cf asix_tx_fixup() */
dev->net->needed_tailroom = 4; /* cf asix_tx_fixup() */
 
-   asix_mdio_write(dev->net, dev->mii.phy_id, MII_BMCR, BMCR_RESET);
+   asix_phy_reset(dev, BMCR_RESET);
asix_mdio_write(dev->net, dev->mii.phy_id, MII_ADVERTISE,
ADVERTISE_ALL | ADVERTISE_CSMA | ADVERTISE_PAUSE_CAP);
mii_nway_restart(&dev->mii);
@@ -900,8 +922,7 @@ static int ax88178_reset(struct usbnet *dev)
} else if (data->phymode == PHY_MODE_RTL8211CL)
rtl8211cl_phy_init(dev);
 
-   asix_mdio_write(dev->net, dev->mii.phy_id, MII_BMCR,
-   BMCR_RESET | BMCR_ANENABLE);
+   asix_phy_reset(dev, BMCR_RESET | BMCR_ANENABLE);
asix_mdio_write(dev->net, dev->mii.phy_id, MII_ADVERTISE,
ADVERTISE_ALL | ADVERTISE_CSMA | ADVERTISE_PAUSE_CAP);
asix_mdio_write(dev->net, dev->mii.phy_id, MII_CTRL1000,
-- 
git-series 0.8.10

[PATCH v3 1/5] net: asix: Add in_pm parameter

2016-08-25 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

From: Freddy Xin <fre...@asix.com.tw>

In order to R/W registers in suspend/resume functions, in_pm flags are
added to some functions to determine whether the nopm version of usb
functions is called.

Save BMCR and ANAR PHY registers in suspend function and restore them
in resume function.

Reset HW in resume function to ensure the PHY works correctly.

Signed-off-by: Freddy Xin <fre...@asix.com.tw>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix.h |  40 +++-
 drivers/net/usb/asix_common.c  | 180 
 drivers/net/usb/asix_devices.c | 373 ++
 drivers/net/usb/ax88172a.c |  29 +--
 4 files changed, 472 insertions(+), 150 deletions(-)

diff --git a/drivers/net/usb/asix.h b/drivers/net/usb/asix.h
index a2d3ea6..d109242 100644
--- a/drivers/net/usb/asix.h
+++ b/drivers/net/usb/asix.h
@@ -46,6 +46,7 @@
 #define AX_CMD_SET_SW_MII       0x06
 #define AX_CMD_READ_MII_REG     0x07
 #define AX_CMD_WRITE_MII_REG    0x08
+#define AX_CMD_STATMNGSTS_REG   0x09
 #define AX_CMD_SET_HW_MII       0x0a
 #define AX_CMD_READ_EEPROM      0x0b
 #define AX_CMD_WRITE_EEPROM     0x0c
@@ -71,6 +72,17 @@
 #define AX_CMD_SW_RESET0x20
 #define AX_CMD_SW_PHY_STATUS   0x21
 #define AX_CMD_SW_PHY_SELECT   0x22
+#define AX_QCTCTRL 0x2A
+
+#define AX_CHIPCODE_MASK   0x70
+#define AX_AX88772_CHIPCODE0x00
+#define AX_AX88772A_CHIPCODE   0x10
+#define AX_AX88772B_CHIPCODE   0x20
+#define AX_HOST_EN 0x01
+
+#define AX_PHYSEL_PSEL 0x01
+#define AX_PHYSEL_SSMII0
+#define AX_PHYSEL_SSEN 0x10
 
 #define AX_PHY_SELECT_MASK (BIT(3) | BIT(2))
 #define AX_PHY_SELECT_INTERNAL 0
@@ -173,6 +185,10 @@ struct asix_rx_fixup_info {
 };
 
 struct asix_common_private {
+   void (*resume)(struct usbnet *dev);
+   void (*suspend)(struct usbnet *dev);
+   u16 presvd_phy_advertise;
+   u16 presvd_phy_bmcr;
struct asix_rx_fixup_info rx_fixup_info;
 };
 
@@ -182,10 +198,10 @@ extern const struct driver_info ax88172a_info;
 #define FLAG_EEPROM_MAC            (1UL << 0)  /* init device MAC from eeprom */
 
 int asix_read_cmd(struct usbnet *dev, u8 cmd, u16 value, u16 index,
- u16 size, void *data);
+ u16 size, void *data, int in_pm);
 
 int asix_write_cmd(struct usbnet *dev, u8 cmd, u16 value, u16 index,
-  u16 size, void *data);
+  u16 size, void *data, int in_pm);
 
 void asix_write_cmd_async(struct usbnet *dev, u8 cmd, u16 value,
  u16 index, u16 size, void *data);
@@ -197,27 +213,31 @@ int asix_rx_fixup_common(struct usbnet *dev, struct 
sk_buff *skb);
 struct sk_buff *asix_tx_fixup(struct usbnet *dev, struct sk_buff *skb,
  gfp_t flags);
 
-int asix_set_sw_mii(struct usbnet *dev);
-int asix_set_hw_mii(struct usbnet *dev);
+int asix_set_sw_mii(struct usbnet *dev, int in_pm);
+int asix_set_hw_mii(struct usbnet *dev, int in_pm);
 
 int asix_read_phy_addr(struct usbnet *dev, int internal);
 int asix_get_phy_addr(struct usbnet *dev);
 
-int asix_sw_reset(struct usbnet *dev, u8 flags);
+int asix_sw_reset(struct usbnet *dev, u8 flags, int in_pm);
 
-u16 asix_read_rx_ctl(struct usbnet *dev);
-int asix_write_rx_ctl(struct usbnet *dev, u16 mode);
+u16 asix_read_rx_ctl(struct usbnet *dev, int in_pm);
+int asix_write_rx_ctl(struct usbnet *dev, u16 mode, int in_pm);
 
-u16 asix_read_medium_status(struct usbnet *dev);
-int asix_write_medium_mode(struct usbnet *dev, u16 mode);
+u16 asix_read_medium_status(struct usbnet *dev, int in_pm);
+int asix_write_medium_mode(struct usbnet *dev, u16 mode, int in_pm);
 
-int asix_write_gpio(struct usbnet *dev, u16 value, int sleep);
+int asix_write_gpio(struct usbnet *dev, u16 value, int sleep, int in_pm);
 
 void asix_set_multicast(struct net_device *net);
 
 int asix_mdio_read(struct net_device *netdev, int phy_id, int loc);
 void asix_mdio_write(struct net_device *netdev, int phy_id, int loc, int val);
 
+int asix_mdio_read_nopm(struct net_device *netdev, int phy_id, int loc);
+void asix_mdio_write_nopm(struct net_device *netdev, int phy_id, int loc,
+ int val);
+
 void asix_get_wol(struct net_device *net, struct ethtool_wolinfo *wolinfo);
 int asix_set_wol(struct net_device *net, struct ethtool_wolinfo *wolinfo);
 
diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c
index 7de5ab5..25609ee 100644
--- a/drivers/net/usb/asix_common.c
+++ b/drivers/net/usb/asix_common.c
@@ -22,24 +22,49 @@
 #include "asix.h"
 
 int asix_read_cmd(struct usbnet *dev, u8 cmd, u16 value, u16

[PATCH v3 1/5] net: asix: Add in_pm parameter

2016-08-25 Thread robert . foss

From: Freddy Xin <fre...@asix.com.tw>

To read and write registers from the suspend/resume functions, an in_pm flag
is added to several functions. It selects whether the nopm variant of the USB
helper functions is called.

Save BMCR and ANAR PHY registers in suspend function and restore them
in resume function.

Reset HW in resume function to ensure the PHY works correctly.

Signed-off-by: Freddy Xin <fre...@asix.com.tw>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix.h |  40 +++--
 drivers/net/usb/asix_common.c  | 180 +++-
 drivers/net/usb/asix_devices.c | 373 -
 drivers/net/usb/ax88172a.c |  29 ++--
 4 files changed, 472 insertions(+), 150 deletions(-)

diff --git a/drivers/net/usb/asix.h b/drivers/net/usb/asix.h
index a2d3ea6..d109242 100644
--- a/drivers/net/usb/asix.h
+++ b/drivers/net/usb/asix.h
@@ -46,6 +46,7 @@
 #define AX_CMD_SET_SW_MII          0x06
 #define AX_CMD_READ_MII_REG        0x07
 #define AX_CMD_WRITE_MII_REG       0x08
+#define AX_CMD_STATMNGSTS_REG      0x09
 #define AX_CMD_SET_HW_MII          0x0a
 #define AX_CMD_READ_EEPROM         0x0b
 #define AX_CMD_WRITE_EEPROM        0x0c
@@ -71,6 +72,17 @@
 #define AX_CMD_SW_RESET            0x20
 #define AX_CMD_SW_PHY_STATUS       0x21
 #define AX_CMD_SW_PHY_SELECT       0x22
+#define AX_QCTCTRL                 0x2A
+
+#define AX_CHIPCODE_MASK           0x70
+#define AX_AX88772_CHIPCODE        0x00
+#define AX_AX88772A_CHIPCODE       0x10
+#define AX_AX88772B_CHIPCODE       0x20
+#define AX_HOST_EN                 0x01
+
+#define AX_PHYSEL_PSEL             0x01
+#define AX_PHYSEL_SSMII            0
+#define AX_PHYSEL_SSEN             0x10
 
 #define AX_PHY_SELECT_MASK (BIT(3) | BIT(2))
 #define AX_PHY_SELECT_INTERNAL 0
@@ -173,6 +185,10 @@ struct asix_rx_fixup_info {
 };
 
 struct asix_common_private {
+   void (*resume)(struct usbnet *dev);
+   void (*suspend)(struct usbnet *dev);
+   u16 presvd_phy_advertise;
+   u16 presvd_phy_bmcr;
struct asix_rx_fixup_info rx_fixup_info;
 };
 
@@ -182,10 +198,10 @@ extern const struct driver_info ax88172a_info;
 #define FLAG_EEPROM_MAC            (1UL << 0)  /* init device MAC from eeprom */
 
 int asix_read_cmd(struct usbnet *dev, u8 cmd, u16 value, u16 index,
- u16 size, void *data);
+ u16 size, void *data, int in_pm);
 
 int asix_write_cmd(struct usbnet *dev, u8 cmd, u16 value, u16 index,
-  u16 size, void *data);
+  u16 size, void *data, int in_pm);
 
 void asix_write_cmd_async(struct usbnet *dev, u8 cmd, u16 value,
  u16 index, u16 size, void *data);
@@ -197,27 +213,31 @@ int asix_rx_fixup_common(struct usbnet *dev, struct 
sk_buff *skb);
 struct sk_buff *asix_tx_fixup(struct usbnet *dev, struct sk_buff *skb,
  gfp_t flags);
 
-int asix_set_sw_mii(struct usbnet *dev);
-int asix_set_hw_mii(struct usbnet *dev);
+int asix_set_sw_mii(struct usbnet *dev, int in_pm);
+int asix_set_hw_mii(struct usbnet *dev, int in_pm);
 
 int asix_read_phy_addr(struct usbnet *dev, int internal);
 int asix_get_phy_addr(struct usbnet *dev);
 
-int asix_sw_reset(struct usbnet *dev, u8 flags);
+int asix_sw_reset(struct usbnet *dev, u8 flags, int in_pm);
 
-u16 asix_read_rx_ctl(struct usbnet *dev);
-int asix_write_rx_ctl(struct usbnet *dev, u16 mode);
+u16 asix_read_rx_ctl(struct usbnet *dev, int in_pm);
+int asix_write_rx_ctl(struct usbnet *dev, u16 mode, int in_pm);
 
-u16 asix_read_medium_status(struct usbnet *dev);
-int asix_write_medium_mode(struct usbnet *dev, u16 mode);
+u16 asix_read_medium_status(struct usbnet *dev, int in_pm);
+int asix_write_medium_mode(struct usbnet *dev, u16 mode, int in_pm);
 
-int asix_write_gpio(struct usbnet *dev, u16 value, int sleep);
+int asix_write_gpio(struct usbnet *dev, u16 value, int sleep, int in_pm);
 
 void asix_set_multicast(struct net_device *net);
 
 int asix_mdio_read(struct net_device *netdev, int phy_id, int loc);
 void asix_mdio_write(struct net_device *netdev, int phy_id, int loc, int val);
 
+int asix_mdio_read_nopm(struct net_device *netdev, int phy_id, int loc);
+void asix_mdio_write_nopm(struct net_device *netdev, int phy_id, int loc,
+ int val);
+
 void asix_get_wol(struct net_device *net, struct ethtool_wolinfo *wolinfo);
 int asix_set_wol(struct net_device *net, struct ethtool_wolinfo *wolinfo);
 
diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c
index 7de5ab5..25609ee 100644
--- a/drivers/net/usb/asix_common.c
+++ b/drivers/net/usb/asix_common.c
@@ -22,24 +22,49 @@
 #include "asix.h"
 
 int asix_read_cmd(struct usbnet *dev, u8 cmd, u16 value, u16 index,
- u16 size, void *data)
+

[PATCH v3 4/5] net: asix: see 802.3 spec for phy reset

2016-08-25 Thread robert . foss

From: Grant Grundler <grund...@chromium.org>

https://lkml.org/lkml/2014/11/11/947

Ben Hutchings is correct. IEEE 802.3 spec section "22.2.4.1.1 Reset" allows
a PHY reset to take up to 500ms. Mitigate the "max" delay by polling the PHY
until the BMCR_RESET bit is clear.
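
The polling idea can be sketched in isolation as below. This is a sketch
under stated assumptions, not the driver code: struct fake_phy,
fake_phy_read_bmcr() and poll_phy_reset() are hypothetical, and the PHY is
simulated in memory so the loop can be exercised without hardware. With the
patch's numbers, 5000 polls at 100us each bound the wait at roughly 500ms.

```c
#include <stdint.h>

#define BMCR_RESET 0x8000u	/* MII BMCR reset bit, self-clearing */

/* Hypothetical simulated PHY: the reset bit reads as set for the first
 * reads_until_ready reads and as clear afterwards. */
struct fake_phy {
	unsigned int reads_until_ready;
	unsigned int reads;
};

static uint16_t fake_phy_read_bmcr(struct fake_phy *phy)
{
	phy->reads++;
	return phy->reads > phy->reads_until_ready ? 0 : BMCR_RESET;
}

/* Poll until BMCR_RESET clears. Returns the number of reads it took,
 * or -1 if the bit was still set after max_polls reads. */
int poll_phy_reset(struct fake_phy *phy, unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if (!(fake_phy_read_bmcr(phy) & BMCR_RESET))
			return (int)(i + 1);
		/* a real driver would delay about 100us here */
	}
	return -1;
}
```

Returning early as soon as the bit clears is what keeps the common case
far below the 500ms worst case.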

Signed-off-by: Grant Grundler <grund...@chromium.org>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix_devices.c | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index 083dc2e..dbcdda2 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -212,6 +212,28 @@ static const struct net_device_ops ax88172_netdev_ops = {
.ndo_set_rx_mode = ax88172_set_multicast,
 };
 
+static void asix_phy_reset(struct usbnet *dev, unsigned int reset_bits)
+{
+   unsigned int timeout = 5000;
+
+   asix_mdio_write(dev->net, dev->mii.phy_id, MII_BMCR, reset_bits);
+
+   /* give phy_id a chance to process reset */
+   udelay(500);
+
+   /* See IEEE 802.3 "22.2.4.1.1 Reset": 500ms max */
+   while (timeout--) {
+   if (asix_mdio_read(dev->net, dev->mii.phy_id, MII_BMCR)
+   & BMCR_RESET)
+   udelay(100);
+   else
+   return;
+   }
+
+   netdev_err(dev->net, "BMCR_RESET timeout on phy_id %d\n",
+  dev->mii.phy_id);
+}
+
 static int ax88172_bind(struct usbnet *dev, struct usb_interface *intf)
 {
int ret = 0;
@@ -258,7 +280,7 @@ static int ax88172_bind(struct usbnet *dev, struct 
usb_interface *intf)
dev->net->needed_headroom = 4; /* cf asix_tx_fixup() */
dev->net->needed_tailroom = 4; /* cf asix_tx_fixup() */
 
-   asix_mdio_write(dev->net, dev->mii.phy_id, MII_BMCR, BMCR_RESET);
+   asix_phy_reset(dev, BMCR_RESET);
asix_mdio_write(dev->net, dev->mii.phy_id, MII_ADVERTISE,
ADVERTISE_ALL | ADVERTISE_CSMA | ADVERTISE_PAUSE_CAP);
mii_nway_restart(&dev->mii);
@@ -900,8 +922,7 @@ static int ax88178_reset(struct usbnet *dev)
} else if (data->phymode == PHY_MODE_RTL8211CL)
rtl8211cl_phy_init(dev);
 
-   asix_mdio_write(dev->net, dev->mii.phy_id, MII_BMCR,
-   BMCR_RESET | BMCR_ANENABLE);
+   asix_phy_reset(dev, BMCR_RESET | BMCR_ANENABLE);
asix_mdio_write(dev->net, dev->mii.phy_id, MII_ADVERTISE,
ADVERTISE_ALL | ADVERTISE_CSMA | ADVERTISE_PAUSE_CAP);
asix_mdio_write(dev->net, dev->mii.phy_id, MII_CTRL1000,
-- 
2.7.4

[PATCH v3 2/5] net: asix: Avoid looping when the device is disconnected

2016-08-25 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

From: Vincent Palatin <vpala...@chromium.org>

Check the return values from the USB stack and avoid re-sending the request
multiple times if the device has disappeared.
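
A stand-alone sketch of the early-exit retry loop follows. It assumes a
simulated device: struct fake_dev, fake_read_status() and
wait_host_enable() are hypothetical names, and only the -ENODEV handling
mirrors what the description above says.

```c
#include <errno.h>

#define HOST_EN 0x01	/* "host enabled" status bit, as in the patch */

/* Hypothetical simulated device. */
struct fake_dev {
	int connected;		/* 0 once the device has been unplugged */
	int ready_after;	/* attempt number at which HOST_EN appears */
	int attempts;		/* status reads issued so far */
};

static int fake_read_status(struct fake_dev *d, unsigned char *smsr)
{
	d->attempts++;
	if (!d->connected)
		return -ENODEV;
	*smsr = d->attempts >= d->ready_after ? HOST_EN : 0;
	return 0;
}

/* Retry until HOST_EN is set, but give up at once on -ENODEV instead of
 * burning the remaining retries on a device that is gone. */
int wait_host_enable(struct fake_dev *d)
{
	unsigned char smsr = 0;
	int i = 0;
	int ret;

	do {
		ret = fake_read_status(d, &smsr);
	} while (ret != -ENODEV && !(smsr & HOST_EN) && i++ < 30);

	return ret == -ENODEV ? ret : 0;
}
```

The sketch tests ret before smsr, so the status byte is never examined
after a failed read.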

Signed-off-by: Vincent Palatin <vpala...@chromium.org>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix_common.c  | 56 +++
 drivers/net/usb/asix_devices.c |  2 +-
 2 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c
index 25609ee..f79eb12 100644
--- a/drivers/net/usb/asix_common.c
+++ b/drivers/net/usb/asix_common.c
@@ -428,13 +428,21 @@ int asix_mdio_read(struct net_device *netdev, int phy_id, 
int loc)
__le16 res;
u8 smsr;
int i = 0;
+   int ret;
 
mutex_lock(&dev->phy_mutex);
do {
-   asix_set_sw_mii(dev, 0);
+   ret = asix_set_sw_mii(dev, 0);
+   if (ret == -ENODEV)
+   break;
usleep_range(1000, 1100);
-   asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 0);
-   } while (!(smsr & AX_HOST_EN) && (i++ < 30));
+   ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+   0, 0, 1, &smsr, 0);
+   } while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+   if (ret == -ENODEV) {
+   mutex_unlock(&dev->phy_mutex);
+   return ret;
+   }
 
asix_read_cmd(dev, AX_CMD_READ_MII_REG, phy_id,
(__u16)loc, 2, &res, 0);
@@ -453,16 +461,24 @@ void asix_mdio_write(struct net_device *netdev, int 
phy_id, int loc, int val)
__le16 res = cpu_to_le16(val);
u8 smsr;
int i = 0;
+   int ret;
 
netdev_dbg(dev->net, "asix_mdio_write() phy_id=0x%02x, loc=0x%02x, 
val=0x%04x\n",
phy_id, loc, val);
 
mutex_lock(&dev->phy_mutex);
do {
-   asix_set_sw_mii(dev, 0);
+   ret = asix_set_sw_mii(dev, 0);
+   if (ret == -ENODEV)
+   break;
usleep_range(1000, 1100);
-   asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 0);
-   } while (!(smsr & AX_HOST_EN) && (i++ < 30));
+   ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+   0, 0, 1, &smsr, 0);
+   } while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+   if (ret == -ENODEV) {
+   mutex_unlock(&dev->phy_mutex);
+   return;
+   }
 
asix_write_cmd(dev, AX_CMD_WRITE_MII_REG, phy_id,
   (__u16)loc, 2, &res, 0);
@@ -476,13 +492,21 @@ int asix_mdio_read_nopm(struct net_device *netdev, int 
phy_id, int loc)
__le16 res;
u8 smsr;
int i = 0;
+   int ret;
 
mutex_lock(&dev->phy_mutex);
do {
-   asix_set_sw_mii(dev, 1);
+   ret = asix_set_sw_mii(dev, 1);
+   if (ret == -ENODEV)
+   break;
usleep_range(1000, 1100);
-   asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 1);
-   } while (!(smsr & AX_HOST_EN) && (i++ < 30));
+   ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+   0, 0, 1, &smsr, 1);
+   } while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+   if (ret == -ENODEV) {
+   mutex_unlock(&dev->phy_mutex);
+   return ret;
+   }
 
asix_read_cmd(dev, AX_CMD_READ_MII_REG, phy_id,
  (__u16)loc, 2, &res, 1);
@@ -502,16 +526,24 @@ asix_mdio_write_nopm(struct net_device *netdev, int 
phy_id, int loc, int val)
__le16 res = cpu_to_le16(val);
u8 smsr;
int i = 0;
+   int ret;
 
netdev_dbg(dev->net, "asix_mdio_write() phy_id=0x%02x, loc=0x%02x, 
val=0x%04x\n",
phy_id, loc, val);
 
mutex_lock(&dev->phy_mutex);
do {
-   asix_set_sw_mii(dev, 1);
+   ret = asix_set_sw_mii(dev, 1);
+   if (ret == -ENODEV)
+   break;
usleep_range(1000, 1100);
-   asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 1);
-   } while (!(smsr & AX_HOST_EN) && (i++ < 30));
+   ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+   0, 0, 1, &smsr, 1);
+   } while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+   if (ret == -ENODEV) {
+   mutex_unlock(&dev->phy_mutex);
+   return;
+   }
 
asix_write_cmd(dev, AX_CMD_WRITE_MII_REG, phy_id,

[PATCH v3 5/5] net: asix: autoneg will set WRITE_MEDIUM reg

2016-08-25 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

From: Grant Grundler <grund...@chromium.org>

mii_nway_restart() triggers PHY link change activity, after which
ax88772_link_reset will be called. link_reset will set the
AX_CMD_WRITE_MEDIUM_MODE register correctly.

The asix_write_medium_mode call in reset() writes a default value to the
register, which may differ from the negotiation result. So do this write first.

Ignore the ret value since it's ignored in the XXX_link_reset() functions.
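
The ordering argument can be illustrated with a toy model. Everything here
is an assumption made for illustration: the register is a plain variable,
the two values are placeholders rather than real medium-mode settings, and
the "old order" function models only the worst case in which the
link-change handler runs before the default write lands.

```c
#include <stdint.h>

/* Placeholder values, not real AX88178 medium-mode settings. */
#define MEDIUM_DEFAULT		0x0001u
#define MEDIUM_NEGOTIATED	0x0002u

static uint16_t medium_reg;	/* stands in for the medium mode register */

static void write_medium_default(void)
{
	medium_reg = MEDIUM_DEFAULT;
}

/* What the link-reset handler does once autonegotiation has finished. */
static void link_reset_after_autoneg(void)
{
	medium_reg = MEDIUM_NEGOTIATED;
}

/* Worst case of the old order: the default write lands last. */
uint16_t reset_old_order(void)
{
	link_reset_after_autoneg();
	write_medium_default();
	return medium_reg;
}

/* New order: default first, so the negotiated value always wins. */
uint16_t reset_new_order(void)
{
	write_medium_default();
	link_reset_after_autoneg();
	return medium_reg;
}
```

In this model the old order can leave the default in the register, while
the new order always ends with the negotiated value.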

Signed-off-by: Grant Grundler <grund...@google.com>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix_devices.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index dbcdda2..cce2495 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -928,12 +928,9 @@ static int ax88178_reset(struct usbnet *dev)
asix_mdio_write(dev->net, dev->mii.phy_id, MII_CTRL1000,
ADVERTISE_1000FULL);
 
+   asix_write_medium_mode(dev, AX88178_MEDIUM_DEFAULT, 0);
mii_nway_restart(&dev->mii);
 
-   ret = asix_write_medium_mode(dev, AX88178_MEDIUM_DEFAULT, 0);
-   if (ret < 0)
-   return ret;
-
/* Rewrite MAC address */
memcpy(data->mac_addr, dev->net->dev_addr, ETH_ALEN);
ret = asix_write_cmd(dev, AX_CMD_WRITE_NODE_ID, 0, 0, ETH_ALEN,
-- 
git-series 0.8.10

[PATCH v3 3/5] net: asix: Fix AX88772x resume failures

2016-08-25 Thread robert . foss

From: Allan Chou <al...@asix.com.tw>

This change fixes AX88772x resume failures by:
- Restoring incorrect AX88772A PHY registers to their defaults when resetting
- Stopping MAC operation when suspending
- Restarting MII autonegotiation when restoring the PHY
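
The first item, restoring PHY registers that drifted from their defaults,
can be sketched with a small table. The register numbers and default values
are the ones defined in this patch; the in-memory register file and the
names struct phy_default and restore_phy_defaults() are hypothetical and
replace the MDIO accesses the driver performs.

```c
#include <stddef.h>
#include <stdint.h>

struct phy_default {
	uint8_t reg;
	uint16_t value;
};

/* Register/default pairs taken from the AX88772A_PHY1xH defines. */
static const struct phy_default ax88772a_defaults[] = {
	{ 0x14, 0x442C },
	{ 0x15, 0x03C8 },
	{ 0x16, 0x4044 },
};

/* Rewrite only the registers that differ from their default.
 * regs is a simulated 32-entry PHY register file. Returns the number
 * of registers that had to be rewritten. */
int restore_phy_defaults(uint16_t regs[32])
{
	size_t n = sizeof(ax88772a_defaults) / sizeof(ax88772a_defaults[0]);
	size_t i;
	int rewritten = 0;

	for (i = 0; i < n; i++) {
		const struct phy_default *d = &ax88772a_defaults[i];

		if (regs[d->reg] != d->value) {
			regs[d->reg] = d->value;
			rewritten++;
		}
	}
	return rewritten;
}
```

Comparing before writing avoids needless MDIO writes when the registers
already hold their defaults.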

Signed-off-by: Allan Chou <al...@asix.com.tw>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix_devices.c | 47 +-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index ebeb730..083dc2e 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -35,6 +35,15 @@
 
 #define PHY_MODE_RTL8211CL      0x000C
 
+#define AX88772A_PHY14H         0x14
+#define AX88772A_PHY14H_DEFAULT 0x442C
+
+#define AX88772A_PHY15H         0x15
+#define AX88772A_PHY15H_DEFAULT 0x03C8
+
+#define AX88772A_PHY16H         0x16
+#define AX88772A_PHY16H_DEFAULT 0x4044
+
 struct ax88172_int_data {
__le16 res1;
u8 link;
@@ -424,7 +433,7 @@ static int ax88772a_hw_reset(struct usbnet *dev, int in_pm)
 {
struct asix_data *data = (struct asix_data *)&dev->data;
int ret, embd_phy;
-   u16 rx_ctl;
+   u16 rx_ctl, phy14h, phy15h, phy16h;
u8 chipcode = 0;
 
ret = asix_write_gpio(dev, AX_GPIO_RSE, 5, in_pm);
@@ -482,6 +491,32 @@ static int ax88772a_hw_reset(struct usbnet *dev, int in_pm)
   ret);
goto out;
}
+   } else if ((chipcode & AX_CHIPCODE_MASK) == AX_AX88772A_CHIPCODE) {
+   /* Check if the PHY registers have default settings */
+   phy14h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY14H);
+   phy15h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY15H);
+   phy16h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY16H);
+
+   netdev_dbg(dev->net,
+  "772a_hw_reset: MR20=0x%x MR21=0x%x MR22=0x%x\n",
+  phy14h, phy15h, phy16h);
+
+   /* Restore PHY registers default setting if not */
+   if (phy14h != AX88772A_PHY14H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY14H,
+AX88772A_PHY14H_DEFAULT);
+   if (phy15h != AX88772A_PHY15H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY15H,
+AX88772A_PHY15H_DEFAULT);
+   if (phy16h != AX88772A_PHY16H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY16H,
+AX88772A_PHY16H_DEFAULT);
}
 
ret = asix_write_cmd(dev, AX_CMD_WRITE_IPG0,
@@ -543,6 +578,15 @@ static const struct net_device_ops ax88772_netdev_ops = {
 static void ax88772_suspend(struct usbnet *dev)
 {
struct asix_common_private *priv = dev->driver_priv;
+   u16 medium;
+
+   /* Stop MAC operation */
+   medium = asix_read_medium_status(dev, 0);
+   medium &= ~AX_MEDIUM_RE;
+   asix_write_medium_mode(dev, medium, 0);
+
+   netdev_dbg(dev->net, "ax88772_suspend: medium=0x%04x\n",
+  asix_read_medium_status(dev, 0));
 
/* Preserve BMCR for restoring */
priv->presvd_phy_bmcr =
@@ -577,6 +621,7 @@ static void ax88772_restore_phy(struct usbnet *dev)
asix_mdio_write_nopm(dev->net, dev->mii.phy_id, MII_BMCR,
 priv->presvd_phy_bmcr);
 
+   mii_nway_restart(&dev->mii);
priv->presvd_phy_advertise = 0;
priv->presvd_phy_bmcr = 0;
}
-- 
2.7.4

[PATCH v3 3/5] net: asix: Fix AX88772x resume failures

2016-08-25 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

From: Allan Chou <al...@asix.com.tw>

This change fixes AX88772x resume failures by:
- Restoring incorrect AX88772A PHY registers to their defaults when resetting
- Stopping MAC operation when suspending
- Restarting MII autonegotiation when restoring the PHY

Signed-off-by: Allan Chou <al...@asix.com.tw>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix_devices.c | 47 ++-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index ebeb730..083dc2e 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -35,6 +35,15 @@
 
 #define PHY_MODE_RTL8211CL      0x000C
 
+#define AX88772A_PHY14H         0x14
+#define AX88772A_PHY14H_DEFAULT 0x442C
+
+#define AX88772A_PHY15H         0x15
+#define AX88772A_PHY15H_DEFAULT 0x03C8
+
+#define AX88772A_PHY16H         0x16
+#define AX88772A_PHY16H_DEFAULT 0x4044
+
 struct ax88172_int_data {
__le16 res1;
u8 link;
@@ -424,7 +433,7 @@ static int ax88772a_hw_reset(struct usbnet *dev, int in_pm)
 {
struct asix_data *data = (struct asix_data *)&dev->data;
int ret, embd_phy;
-   u16 rx_ctl;
+   u16 rx_ctl, phy14h, phy15h, phy16h;
u8 chipcode = 0;
 
ret = asix_write_gpio(dev, AX_GPIO_RSE, 5, in_pm);
@@ -482,6 +491,32 @@ static int ax88772a_hw_reset(struct usbnet *dev, int in_pm)
   ret);
goto out;
}
+   } else if ((chipcode & AX_CHIPCODE_MASK) == AX_AX88772A_CHIPCODE) {
+   /* Check if the PHY registers have default settings */
+   phy14h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY14H);
+   phy15h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY15H);
+   phy16h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY16H);
+
+   netdev_dbg(dev->net,
+  "772a_hw_reset: MR20=0x%x MR21=0x%x MR22=0x%x\n",
+  phy14h, phy15h, phy16h);
+
+   /* Restore PHY registers default setting if not */
+   if (phy14h != AX88772A_PHY14H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY14H,
+AX88772A_PHY14H_DEFAULT);
+   if (phy15h != AX88772A_PHY15H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY15H,
+AX88772A_PHY15H_DEFAULT);
+   if (phy16h != AX88772A_PHY16H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY16H,
+AX88772A_PHY16H_DEFAULT);
}
 
ret = asix_write_cmd(dev, AX_CMD_WRITE_IPG0,
@@ -543,6 +578,15 @@ static const struct net_device_ops ax88772_netdev_ops = {
 static void ax88772_suspend(struct usbnet *dev)
 {
struct asix_common_private *priv = dev->driver_priv;
+   u16 medium;
+
+   /* Stop MAC operation */
+   medium = asix_read_medium_status(dev, 0);
+   medium &= ~AX_MEDIUM_RE;
+   asix_write_medium_mode(dev, medium, 0);
+
+   netdev_dbg(dev->net, "ax88772_suspend: medium=0x%04x\n",
+  asix_read_medium_status(dev, 0));
 
/* Preserve BMCR for restoring */
priv->presvd_phy_bmcr =
@@ -577,6 +621,7 @@ static void ax88772_restore_phy(struct usbnet *dev)
asix_mdio_write_nopm(dev->net, dev->mii.phy_id, MII_BMCR,
 priv->presvd_phy_bmcr);
 
+   mii_nway_restart(&dev->mii);
priv->presvd_phy_advertise = 0;
priv->presvd_phy_bmcr = 0;
}
-- 
git-series 0.8.10

[PATCH v3 5/5] net: asix: autoneg will set WRITE_MEDIUM reg

2016-08-25 Thread robert . foss

From: Robert Foss 

From: Grant Grundler 

mii_nway_restart() triggers PHY link-change activity, which causes
ax88772_link_reset to be called. link_reset then sets the
AX_CMD_WRITE_MEDIUM_MODE register correctly.

The asix_write_medium_mode call in reset() fills in a default value for the
register, which may differ from the negotiation result. So do this first.

Ignore the ret value since it's ignored in XXX_link_reset() functions.

Signed-off-by: Grant Grundler 
Signed-off-by: Robert Foss 
Tested-by: Robert Foss 
---
 drivers/net/usb/asix_devices.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index dbcdda2..cce2495 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -928,12 +928,9 @@ static int ax88178_reset(struct usbnet *dev)
asix_mdio_write(dev->net, dev->mii.phy_id, MII_CTRL1000,
ADVERTISE_1000FULL);
 
+   asix_write_medium_mode(dev, AX88178_MEDIUM_DEFAULT, 0);
mii_nway_restart(&dev->mii);
 
-   ret = asix_write_medium_mode(dev, AX88178_MEDIUM_DEFAULT, 0);
-   if (ret < 0)
-   return ret;
-
/* Rewrite MAC address */
memcpy(data->mac_addr, dev->net->dev_addr, ETH_ALEN);
ret = asix_write_cmd(dev, AX_CMD_WRITE_NODE_ID, 0, 0, ETH_ALEN,
-- 
git-series 0.8.10

[PATCH v3 3/5] net: asix: Fix AX88772x resume failures

2016-08-25 Thread robert . foss

From: Allan Chou 

This change fixes AX88772x resume failures by:
- Restoring incorrect AX88772A PHY registers when resetting
- Stopping MAC operation when suspending
- Restarting MII when restoring the PHY

Signed-off-by: Allan Chou 
Signed-off-by: Robert Foss 
Tested-by: Robert Foss 
---
 drivers/net/usb/asix_devices.c | 47 +-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index ebeb730..083dc2e 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -35,6 +35,15 @@
 
 #define PHY_MODE_RTL8211CL  0x000C
 
+#define AX88772A_PHY14H 0x14
+#define AX88772A_PHY14H_DEFAULT 0x442C
+
+#define AX88772A_PHY15H 0x15
+#define AX88772A_PHY15H_DEFAULT 0x03C8
+
+#define AX88772A_PHY16H 0x16
+#define AX88772A_PHY16H_DEFAULT 0x4044
+
 struct ax88172_int_data {
__le16 res1;
u8 link;
@@ -424,7 +433,7 @@ static int ax88772a_hw_reset(struct usbnet *dev, int in_pm)
 {
struct asix_data *data = (struct asix_data *)&dev->data;
int ret, embd_phy;
-   u16 rx_ctl;
+   u16 rx_ctl, phy14h, phy15h, phy16h;
u8 chipcode = 0;
 
ret = asix_write_gpio(dev, AX_GPIO_RSE, 5, in_pm);
@@ -482,6 +491,32 @@ static int ax88772a_hw_reset(struct usbnet *dev, int in_pm)
   ret);
goto out;
}
+   } else if ((chipcode & AX_CHIPCODE_MASK) == AX_AX88772A_CHIPCODE) {
+   /* Check if the PHY registers have default settings */
+   phy14h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY14H);
+   phy15h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY15H);
+   phy16h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY16H);
+
+   netdev_dbg(dev->net,
+  "772a_hw_reset: MR20=0x%x MR21=0x%x MR22=0x%x\n",
+  phy14h, phy15h, phy16h);
+
+   /* Restore PHY registers default setting if not */
+   if (phy14h != AX88772A_PHY14H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY14H,
+AX88772A_PHY14H_DEFAULT);
+   if (phy15h != AX88772A_PHY15H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY15H,
+AX88772A_PHY15H_DEFAULT);
+   if (phy16h != AX88772A_PHY16H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY16H,
+AX88772A_PHY16H_DEFAULT);
}
 
ret = asix_write_cmd(dev, AX_CMD_WRITE_IPG0,
@@ -543,6 +578,15 @@ static const struct net_device_ops ax88772_netdev_ops = {
 static void ax88772_suspend(struct usbnet *dev)
 {
struct asix_common_private *priv = dev->driver_priv;
+   u16 medium;
+
+   /* Stop MAC operation */
+   medium = asix_read_medium_status(dev, 0);
+   medium &= ~AX_MEDIUM_RE;
+   asix_write_medium_mode(dev, medium, 0);
+
+   netdev_dbg(dev->net, "ax88772_suspend: medium=0x%04x\n",
+  asix_read_medium_status(dev, 0));
 
/* Preserve BMCR for restoring */
priv->presvd_phy_bmcr =
@@ -577,6 +621,7 @@ static void ax88772_restore_phy(struct usbnet *dev)
asix_mdio_write_nopm(dev->net, dev->mii.phy_id, MII_BMCR,
 priv->presvd_phy_bmcr);
 
+   mii_nway_restart(&dev->mii);
priv->presvd_phy_advertise = 0;
priv->presvd_phy_bmcr = 0;
}
-- 
2.7.4
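
For readers following the reset path in the patch above, here is a minimal user-space sketch of the two ideas it relies on: masking the chip code to identify an AX88772A, and writing a PHY register back only when it differs from its default. The mask, chip code and default values are copied from this series; the helper names is_ax88772a() and restore_phy_defaults() and the in-memory register array are assumptions made purely for illustration and are not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the asix.h hunk in this series. */
#define AX_CHIPCODE_MASK      0x70
#define AX_AX88772A_CHIPCODE  0x10

/* Defaults for PHY registers 0x14, 0x15 and 0x16, copied from the patch. */
#define NUM_PHY_REGS 3
static const uint16_t phy_defaults[NUM_PHY_REGS] = { 0x442C, 0x03C8, 0x4044 };

/* Hypothetical helper: only the bits selected by the mask identify the chip. */
static int is_ax88772a(uint8_t chipcode)
{
	return (chipcode & AX_CHIPCODE_MASK) == AX_AX88772A_CHIPCODE;
}

/*
 * Hypothetical helper: the array stands in for the three PHY registers.
 * A register is rewritten only when it differs from its default, as in
 * the patch. Returns the number of registers rewritten.
 */
static int restore_phy_defaults(uint16_t regs[NUM_PHY_REGS])
{
	int writes = 0;
	size_t i;

	assert(regs != NULL);

	for (i = 0; i < NUM_PHY_REGS; i++) {
		if (regs[i] != phy_defaults[i]) {
			regs[i] = phy_defaults[i];
			writes++;
		}
	}
	return writes;
}
```

In the driver itself the reads and writes go through asix_mdio_read_nopm() and asix_mdio_write_nopm() rather than an array, so this only models the comparison logic.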

[PATCH v3 4/5] net: asix: see 802.3 spec for phy reset

2016-08-25 Thread robert . foss

From: Grant Grundler 

https://lkml.org/lkml/2014/11/11/947

Ben Hutchings is correct. IEEE 802.3 spec section "22.2.4.1.1 Reset" allows a
PHY reset to take up to 500ms. Mitigate the "max" delay by polling the phy
until the BMCR_RESET bit is clear.

Signed-off-by: Grant Grundler 
Signed-off-by: Robert Foss 
Tested-by: Robert Foss 
---
 drivers/net/usb/asix_devices.c | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index 083dc2e..dbcdda2 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -212,6 +212,28 @@ static const struct net_device_ops ax88172_netdev_ops = {
.ndo_set_rx_mode = ax88172_set_multicast,
 };
 
+static void asix_phy_reset(struct usbnet *dev, unsigned int reset_bits)
+{
+   unsigned int timeout = 5000;
+
+   asix_mdio_write(dev->net, dev->mii.phy_id, MII_BMCR, reset_bits);
+
+   /* give phy_id a chance to process reset */
+   udelay(500);
+
+   /* See IEEE 802.3 "22.2.4.1.1 Reset": 500ms max */
+   while (timeout--) {
+   if (asix_mdio_read(dev->net, dev->mii.phy_id, MII_BMCR)
+   & BMCR_RESET)
+   udelay(100);
+   else
+   return;
+   }
+
+   netdev_err(dev->net, "BMCR_RESET timeout on phy_id %d\n",
+  dev->mii.phy_id);
+}
+
 static int ax88172_bind(struct usbnet *dev, struct usb_interface *intf)
 {
int ret = 0;
@@ -258,7 +280,7 @@ static int ax88172_bind(struct usbnet *dev, struct 
usb_interface *intf)
dev->net->needed_headroom = 4; /* cf asix_tx_fixup() */
dev->net->needed_tailroom = 4; /* cf asix_tx_fixup() */
 
-   asix_mdio_write(dev->net, dev->mii.phy_id, MII_BMCR, BMCR_RESET);
+   asix_phy_reset(dev, BMCR_RESET);
asix_mdio_write(dev->net, dev->mii.phy_id, MII_ADVERTISE,
ADVERTISE_ALL | ADVERTISE_CSMA | ADVERTISE_PAUSE_CAP);
mii_nway_restart(&dev->mii);
@@ -900,8 +922,7 @@ static int ax88178_reset(struct usbnet *dev)
} else if (data->phymode == PHY_MODE_RTL8211CL)
rtl8211cl_phy_init(dev);
 
-   asix_mdio_write(dev->net, dev->mii.phy_id, MII_BMCR,
-   BMCR_RESET | BMCR_ANENABLE);
+   asix_phy_reset(dev, BMCR_RESET | BMCR_ANENABLE);
asix_mdio_write(dev->net, dev->mii.phy_id, MII_ADVERTISE,
ADVERTISE_ALL | ADVERTISE_CSMA | ADVERTISE_PAUSE_CAP);
asix_mdio_write(dev->net, dev->mii.phy_id, MII_CTRL1000,
-- 
2.7.4
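
The polling in asix_phy_reset() above is bounded: 5000 iterations with udelay(100) between reads gives 5000 x 100us = 500ms, matching the limit quoted from the spec. The sketch below models that bounded poll in user space. BMCR_RESET is the standard MII value (bit 15 of BMCR); the fake PHY, the callback type and poll_reset_done() are assumptions for illustration only, and the delay between reads is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BMCR_RESET 0x8000u	/* MII BMCR bit 15, self-clearing when reset completes */

/* Hypothetical register-read callback standing in for asix_mdio_read(). */
typedef uint16_t (*bmcr_read_fn)(void *ctx);

/* Hypothetical fake PHY: reports BMCR_RESET set for the first
 * busy_reads reads, then reports it clear. */
struct fake_phy {
	unsigned int busy_reads;
	unsigned int reads_seen;
};

static uint16_t fake_phy_read_bmcr(void *ctx)
{
	struct fake_phy *phy = ctx;

	phy->reads_seen++;
	if (phy->busy_reads > 0) {
		phy->busy_reads--;
		return BMCR_RESET;
	}
	return 0;
}

/*
 * Poll until BMCR_RESET clears or max_polls reads have been made.
 * Returns the number of reads performed on success, or -1 on timeout.
 * A real driver would delay between reads (the patch uses udelay(100)).
 */
static int poll_reset_done(bmcr_read_fn read_bmcr, void *ctx,
			   unsigned int max_polls)
{
	unsigned int polls;

	assert(read_bmcr != NULL);

	for (polls = 1; polls <= max_polls; polls++) {
		if (!(read_bmcr(ctx) & BMCR_RESET))
			return (int)polls;
	}
	return -1;
}
```

Unlike this sketch, the patch does not return a status: it logs "BMCR_RESET timeout" through netdev_err() and carries on.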

[PATCH v3 2/5] net: asix: Avoid looping when the device is disconnected

2016-08-25 Thread robert . foss

From: Robert Foss 

From: Vincent Palatin 

Check the answers from the USB stack and avoid re-sending multiple times
the request if the device has disappeared.

Signed-off-by: Vincent Palatin 
Signed-off-by: Robert Foss 
Tested-by: Robert Foss 
---
 drivers/net/usb/asix_common.c  | 56 +++
 drivers/net/usb/asix_devices.c |  2 +-
 2 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c
index 25609ee..f79eb12 100644
--- a/drivers/net/usb/asix_common.c
+++ b/drivers/net/usb/asix_common.c
@@ -428,13 +428,21 @@ int asix_mdio_read(struct net_device *netdev, int phy_id, 
int loc)
__le16 res;
u8 smsr;
int i = 0;
+   int ret;
 
mutex_lock(&dev->phy_mutex);
do {
-   asix_set_sw_mii(dev, 0);
+   ret = asix_set_sw_mii(dev, 0);
+   if (ret == -ENODEV)
+   break;
usleep_range(1000, 1100);
-   asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 0);
-   } while (!(smsr & AX_HOST_EN) && (i++ < 30));
+   ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+   0, 0, 1, &smsr, 0);
+   } while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+   if (ret == -ENODEV) {
+   mutex_unlock(&dev->phy_mutex);
+   return ret;
+   }
 
asix_read_cmd(dev, AX_CMD_READ_MII_REG, phy_id,
(__u16)loc, 2, &res, 0);
@@ -453,16 +461,24 @@ void asix_mdio_write(struct net_device *netdev, int 
phy_id, int loc, int val)
__le16 res = cpu_to_le16(val);
u8 smsr;
int i = 0;
+   int ret;
 
netdev_dbg(dev->net, "asix_mdio_write() phy_id=0x%02x, loc=0x%02x, 
val=0x%04x\n",
phy_id, loc, val);
 
mutex_lock(&dev->phy_mutex);
do {
-   asix_set_sw_mii(dev, 0);
+   ret = asix_set_sw_mii(dev, 0);
+   if (ret == -ENODEV)
+   break;
usleep_range(1000, 1100);
-   asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 0);
-   } while (!(smsr & AX_HOST_EN) && (i++ < 30));
+   ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+   0, 0, 1, &smsr, 0);
+   } while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+   if (ret == -ENODEV) {
+   mutex_unlock(&dev->phy_mutex);
+   return;
+   }
 
asix_write_cmd(dev, AX_CMD_WRITE_MII_REG, phy_id,
   (__u16)loc, 2, &res, 0);
@@ -476,13 +492,21 @@ int asix_mdio_read_nopm(struct net_device *netdev, int 
phy_id, int loc)
__le16 res;
u8 smsr;
int i = 0;
+   int ret;
 
mutex_lock(&dev->phy_mutex);
do {
-   asix_set_sw_mii(dev, 1);
+   ret = asix_set_sw_mii(dev, 1);
+   if (ret == -ENODEV)
+   break;
usleep_range(1000, 1100);
-   asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 1);
-   } while (!(smsr & AX_HOST_EN) && (i++ < 30));
+   ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+   0, 0, 1, &smsr, 1);
+   } while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+   if (ret == -ENODEV) {
+   mutex_unlock(&dev->phy_mutex);
+   return ret;
+   }
 
asix_read_cmd(dev, AX_CMD_READ_MII_REG, phy_id,
  (__u16)loc, 2, &res, 1);
@@ -502,16 +526,24 @@ asix_mdio_write_nopm(struct net_device *netdev, int 
phy_id, int loc, int val)
__le16 res = cpu_to_le16(val);
u8 smsr;
int i = 0;
+   int ret;
 
netdev_dbg(dev->net, "asix_mdio_write() phy_id=0x%02x, loc=0x%02x, 
val=0x%04x\n",
phy_id, loc, val);
 
mutex_lock(&dev->phy_mutex);
do {
-   asix_set_sw_mii(dev, 1);
+   ret = asix_set_sw_mii(dev, 1);
+   if (ret == -ENODEV)
+   break;
usleep_range(1000, 1100);
-   asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 1);
-   } while (!(smsr & AX_HOST_EN) && (i++ < 30));
+   ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+   0, 0, 1, &smsr, 1);
+   } while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+   if (ret == -ENODEV) {
+   mutex_unlock(&dev->phy_mutex);
+   return;
+   }
 
asix_write_cmd(dev, AX_CMD_WRITE_MII_REG, phy_id,
   (__u16)loc, 2, &res, 1);
diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index aaa4290..ebeb730 100644
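
The retry loops in the hunks above all follow the same shape: keep requesting software MII ownership until AX_HOST_EN is seen, give up after a bounded number of attempts, and stop immediately if the USB stack reports -ENODEV. Below is a rough user-space model of that loop. AX_HOST_EN is taken from asix.h in this series; struct fake_dev, the fake_* helpers and acquire_sw_mii() are assumptions for illustration. The -ETIMEDOUT return is also an addition of this sketch; the patch itself simply proceeds once the attempts are used up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define AX_HOST_EN 0x01	/* from asix.h in this series */

/* Hypothetical device model: either unplugged, or it grants software
 * MII ownership on the ready_after-th status read. */
struct fake_dev {
	int unplugged;
	int ready_after;
	int status_reads;
};

/* Stand-in for asix_set_sw_mii(): fails with -ENODEV once the device is gone. */
static int fake_set_sw_mii(struct fake_dev *d)
{
	return d->unplugged ? -ENODEV : 0;
}

/* Stand-in for asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, ...). */
static int fake_read_status(struct fake_dev *d, uint8_t *smsr)
{
	if (d->unplugged)
		return -ENODEV;
	d->status_reads++;
	*smsr = (d->status_reads >= d->ready_after) ? AX_HOST_EN : 0;
	return 1;	/* bytes read */
}

/*
 * Returns 0 once AX_HOST_EN is seen, -ENODEV if the device vanished, or
 * -ETIMEDOUT if ownership was never granted. With the (i++ < 30) test in
 * a do/while, the body runs at most 31 times.
 */
static int acquire_sw_mii(struct fake_dev *d)
{
	uint8_t smsr = 0;
	int i = 0;
	int ret;

	assert(d != NULL);

	do {
		ret = fake_set_sw_mii(d);
		if (ret == -ENODEV)
			break;
		ret = fake_read_status(d, &smsr);
	} while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));

	if (ret == -ENODEV)
		return ret;
	return (smsr & AX_HOST_EN) ? 0 : -ETIMEDOUT;
}
```

The sketch initialises smsr to 0 so that a failed status read cannot leave it undefined; the driver code in the patch declares it without an initialiser.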

[PATCH v3 3/5] net: asix: Fix AX88772x resume failures

2016-08-25 Thread robert . foss

From: Robert Foss 

From: Allan Chou 

This change fixes AX88772x resume failures by:
- Restoring incorrect AX88772A PHY registers when resetting
- Stopping MAC operation when suspending
- Restarting MII when restoring the PHY

Signed-off-by: Allan Chou 
Signed-off-by: Robert Foss 
Tested-by: Robert Foss 
---
 drivers/net/usb/asix_devices.c | 47 ++-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index ebeb730..083dc2e 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -35,6 +35,15 @@
 
 #define PHY_MODE_RTL8211CL  0x000C
 
+#define AX88772A_PHY14H 0x14
+#define AX88772A_PHY14H_DEFAULT 0x442C
+
+#define AX88772A_PHY15H 0x15
+#define AX88772A_PHY15H_DEFAULT 0x03C8
+
+#define AX88772A_PHY16H 0x16
+#define AX88772A_PHY16H_DEFAULT 0x4044
+
 struct ax88172_int_data {
__le16 res1;
u8 link;
@@ -424,7 +433,7 @@ static int ax88772a_hw_reset(struct usbnet *dev, int in_pm)
 {
struct asix_data *data = (struct asix_data *)&dev->data;
int ret, embd_phy;
-   u16 rx_ctl;
+   u16 rx_ctl, phy14h, phy15h, phy16h;
u8 chipcode = 0;
 
ret = asix_write_gpio(dev, AX_GPIO_RSE, 5, in_pm);
@@ -482,6 +491,32 @@ static int ax88772a_hw_reset(struct usbnet *dev, int in_pm)
   ret);
goto out;
}
+   } else if ((chipcode & AX_CHIPCODE_MASK) == AX_AX88772A_CHIPCODE) {
+   /* Check if the PHY registers have default settings */
+   phy14h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY14H);
+   phy15h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY15H);
+   phy16h = asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY16H);
+
+   netdev_dbg(dev->net,
+  "772a_hw_reset: MR20=0x%x MR21=0x%x MR22=0x%x\n",
+  phy14h, phy15h, phy16h);
+
+   /* Restore PHY registers default setting if not */
+   if (phy14h != AX88772A_PHY14H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY14H,
+AX88772A_PHY14H_DEFAULT);
+   if (phy15h != AX88772A_PHY15H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY15H,
+AX88772A_PHY15H_DEFAULT);
+   if (phy16h != AX88772A_PHY16H_DEFAULT)
+   asix_mdio_write_nopm(dev->net, dev->mii.phy_id,
+AX88772A_PHY16H,
+AX88772A_PHY16H_DEFAULT);
}
 
ret = asix_write_cmd(dev, AX_CMD_WRITE_IPG0,
@@ -543,6 +578,15 @@ static const struct net_device_ops ax88772_netdev_ops = {
 static void ax88772_suspend(struct usbnet *dev)
 {
struct asix_common_private *priv = dev->driver_priv;
+   u16 medium;
+
+   /* Stop MAC operation */
+   medium = asix_read_medium_status(dev, 0);
+   medium &= ~AX_MEDIUM_RE;
+   asix_write_medium_mode(dev, medium, 0);
+
+   netdev_dbg(dev->net, "ax88772_suspend: medium=0x%04x\n",
+  asix_read_medium_status(dev, 0));
 
/* Preserve BMCR for restoring */
priv->presvd_phy_bmcr =
@@ -577,6 +621,7 @@ static void ax88772_restore_phy(struct usbnet *dev)
asix_mdio_write_nopm(dev->net, dev->mii.phy_id, MII_BMCR,
 priv->presvd_phy_bmcr);
 
+   mii_nway_restart(&dev->mii);
priv->presvd_phy_advertise = 0;
priv->presvd_phy_bmcr = 0;
}
-- 
git-series 0.8.10

[PATCH v3 5/5] net: asix: autoneg will set WRITE_MEDIUM reg

2016-08-25 Thread robert . foss

From: Grant Grundler <grund...@chromium.org>

mii_nway_restart() triggers PHY link-change activity, which causes
ax88772_link_reset to be called. link_reset then sets the
AX_CMD_WRITE_MEDIUM_MODE register correctly.

The asix_write_medium_mode call in reset() fills in a default value for the
register, which may differ from the negotiation result. So do this first.

Ignore the ret value since it's ignored in XXX_link_reset() functions.

Signed-off-by: Grant Grundler <grund...@google.com>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix_devices.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index dbcdda2..cce2495 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -928,12 +928,9 @@ static int ax88178_reset(struct usbnet *dev)
asix_mdio_write(dev->net, dev->mii.phy_id, MII_CTRL1000,
ADVERTISE_1000FULL);
 
+   asix_write_medium_mode(dev, AX88178_MEDIUM_DEFAULT, 0);
mii_nway_restart(&dev->mii);
 
-   ret = asix_write_medium_mode(dev, AX88178_MEDIUM_DEFAULT, 0);
-   if (ret < 0)
-   return ret;
-
/* Rewrite MAC address */
memcpy(data->mac_addr, dev->net->dev_addr, ETH_ALEN);
ret = asix_write_cmd(dev, AX_CMD_WRITE_NODE_ID, 0, 0, ETH_ALEN,
-- 
2.7.4

[PATCH v3 0/5] net/usb: asix driver improvements

2016-08-25 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

This series improves power management of the asix driver.

 - Suspend/resume support is improved to save needed registers.
 - Device disconnection is improved.
 - Fixes AX88772x resume failures
 - Implements IEEE 802.3 spec section "22.2.4.1.1 Reset" correctly
 - Fixes AX_CMD_WRITE_MEDIUM_MODE being set incorrectly

Changes since v1:
- Added proper metadata tags to series.
- Added two more patches to series.

Changes since v2:
- Added cover letter
- Tested patches on AX88772A/AX88772B/AX88178/AX88179 hardware

Allan Chou (1):
  net: asix: Fix AX88772x resume failures

Freddy Xin (1):
  net: asix: Add in_pm parameter

Grant Grundler (2):
  net: asix: see 802.3 spec for phy reset
  net: asix: autoneg will set WRITE_MEDIUM reg

Vincent Palatin (1):
  net: asix: Avoid looping when the device is disconnected

 drivers/net/usb/asix.h |  40 ++-
 drivers/net/usb/asix_common.c  | 212 
 drivers/net/usb/asix_devices.c | 450 +++---
 drivers/net/usb/ax88172a.c |  29 +-
 4 files changed, 575 insertions(+), 156 deletions(-)

-- 
git-series 0.8.10

[PATCH v3 2/5] net: asix: Avoid looping when the device is disconnected

2016-08-25 Thread robert . foss

From: Vincent Palatin <vpala...@chromium.org>

Check the answers from the USB stack and avoid re-sending multiple times
the request if the device has disappeared.

Signed-off-by: Vincent Palatin <vpala...@chromium.org>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/net/usb/asix_common.c  | 56 +-
 drivers/net/usb/asix_devices.c |  2 ++
 2 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c
index 25609ee..f79eb12 100644
--- a/drivers/net/usb/asix_common.c
+++ b/drivers/net/usb/asix_common.c
@@ -428,13 +428,21 @@ int asix_mdio_read(struct net_device *netdev, int phy_id, 
int loc)
__le16 res;
u8 smsr;
int i = 0;
+   int ret;
 
mutex_lock(&dev->phy_mutex);
do {
-   asix_set_sw_mii(dev, 0);
+   ret = asix_set_sw_mii(dev, 0);
+   if (ret == -ENODEV)
+   break;
usleep_range(1000, 1100);
-   asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 0);
-   } while (!(smsr & AX_HOST_EN) && (i++ < 30));
+   ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+   0, 0, 1, &smsr, 0);
+   } while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+   if (ret == -ENODEV) {
+   mutex_unlock(&dev->phy_mutex);
+   return ret;
+   }
 
asix_read_cmd(dev, AX_CMD_READ_MII_REG, phy_id,
(__u16)loc, 2, &res, 0);
@@ -453,16 +461,24 @@ void asix_mdio_write(struct net_device *netdev, int 
phy_id, int loc, int val)
__le16 res = cpu_to_le16(val);
u8 smsr;
int i = 0;
+   int ret;
 
netdev_dbg(dev->net, "asix_mdio_write() phy_id=0x%02x, loc=0x%02x, 
val=0x%04x\n",
phy_id, loc, val);
 
mutex_lock(&dev->phy_mutex);
do {
-   asix_set_sw_mii(dev, 0);
+   ret = asix_set_sw_mii(dev, 0);
+   if (ret == -ENODEV)
+   break;
usleep_range(1000, 1100);
-   asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 0);
-   } while (!(smsr & AX_HOST_EN) && (i++ < 30));
+   ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+   0, 0, 1, &smsr, 0);
+   } while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+   if (ret == -ENODEV) {
+   mutex_unlock(&dev->phy_mutex);
+   return;
+   }
 
asix_write_cmd(dev, AX_CMD_WRITE_MII_REG, phy_id,
   (__u16)loc, 2, &res, 0);
@@ -476,13 +492,21 @@ int asix_mdio_read_nopm(struct net_device *netdev, int 
phy_id, int loc)
__le16 res;
u8 smsr;
int i = 0;
+   int ret;
 
mutex_lock(&dev->phy_mutex);
do {
-   asix_set_sw_mii(dev, 1);
+   ret = asix_set_sw_mii(dev, 1);
+   if (ret == -ENODEV)
+   break;
usleep_range(1000, 1100);
-   asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 1);
-   } while (!(smsr & AX_HOST_EN) && (i++ < 30));
+   ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+   0, 0, 1, &smsr, 1);
+   } while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+   if (ret == -ENODEV) {
+   mutex_unlock(&dev->phy_mutex);
+   return ret;
+   }
 
asix_read_cmd(dev, AX_CMD_READ_MII_REG, phy_id,
  (__u16)loc, 2, &res, 1);
@@ -502,16 +526,24 @@ asix_mdio_write_nopm(struct net_device *netdev, int 
phy_id, int loc, int val)
__le16 res = cpu_to_le16(val);
u8 smsr;
int i = 0;
+   int ret;
 
netdev_dbg(dev->net, "asix_mdio_write() phy_id=0x%02x, loc=0x%02x, 
val=0x%04x\n",
phy_id, loc, val);
 
mutex_lock(&dev->phy_mutex);
do {
-   asix_set_sw_mii(dev, 1);
+   ret = asix_set_sw_mii(dev, 1);
+   if (ret == -ENODEV)
+   break;
usleep_range(1000, 1100);
-   asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &smsr, 1);
-   } while (!(smsr & AX_HOST_EN) && (i++ < 30));
+   ret = asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG,
+   0, 0, 1, &smsr, 1);
+   } while (!(smsr & AX_HOST_EN) && (i++ < 30) && (ret != -ENODEV));
+   if (ret == -ENODEV) {
+   mutex_unlock(&dev->phy_mutex);
+   return;
+   }
 
asix_write_cmd(dev, AX_CMD_WRITE_MII_REG, phy_id,
   (__u16)loc, 2, &res, 1);
diff --git a

Re: [PACTH,v6,1/2] usb: xhci: plat: Enable runtime PM

2016-08-24 Thread Robert Foss




On 2016-08-22 11:23 PM, Brian Norris wrote:

+ others

Hi Robert and Felipe,

I have a few questions for one or both of you. I'm not really an expert
on runtime PM, so please take my questions with a grain of salt.

On Wed, Aug 10, 2016 at 04:32:15PM -0400, robert.f...@collabora.com wrote:

From: Robert Foss <robert.f...@collabora.com>

Enable runtime PM for the xhci-plat device so that the parent device
may implement runtime PM.

Signed-off-by: Robert Foss <robert.f...@collabora.com>

Tested-by: Robert Foss <robert.f...@collabora.com>
---
 drivers/usb/host/xhci-plat.c | 29 +++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
index ed56bf9..ba4efe7 100644
--- a/drivers/usb/host/xhci-plat.c
+++ b/drivers/usb/host/xhci-plat.c
@@ -246,6 +246,9 @@ static int xhci_plat_probe(struct platform_device *pdev)
if (ret)
goto dealloc_usb2_hcd;

+   pm_runtime_set_active(&pdev->dev);
+   pm_runtime_enable(&pdev->dev);
+


How does it help to enable PM runtime like this, if you don't have any
kind of runtime_{suspend,resume}() callbacks?


Andrew, I think you understand the inner workings of this code better 
than me, maybe you could give a short summary?




I suspect that this patch set was derived from the Chromium OS kernel
tree, where we were supporting a Tegra XHCI chipset:

https://chromium.googlesource.com/chromiumos/third_party/kernel/+/chromeos-3.10/drivers/usb/host/xhci-tegra.c#1920

It looks like the driver was refactored to not use xhci-plat.c before it
was upstreamed (and runtime PM support was dropped along the way).

So, I'm wondering how I might actually use this? Particularly, I'm
looking at trying out runtime suspend for a DWC3 controller in host
mode, and it looks like I'd have to do some layer-violating calls to
xhci_suspend()/xhci_resume() from the parent dwc3 device, or else
rewrite drivers/usb/dwc3/host.c to avoid using xhci-plat.c.

(I also see that Baolin, CC'd here, was interested in dwc3 [1].)

Or possibly an enlightening question for me: if you don't mind, how are
you utilizing runtime PM in conjunction with xhci-plat.c, Robert?
Presumably some other parent device/driver is doing some additional
management of the XHCI core?

Regards,
Brian

[1] [PATCH 4/4] usb: dwc3: core: Support the dwc3 host suspend/resume
https://lkml.org/lkml/2016/7/15/181
https://patchwork.kernel.org/patch/9231417/


return 0;


@@ -274,6 +277,8 @@ static int xhci_plat_remove(struct platform_device *dev)
struct xhci_hcd *xhci = hcd_to_xhci(hcd);
struct clk *clk = xhci->clk;

+   pm_runtime_disable(&dev->dev);
+
usb_remove_hcd(xhci->shared_hcd);
usb_phy_shutdown(hcd->usb_phy);

@@ -292,6 +297,13 @@ static int xhci_plat_suspend(struct device *dev)
 {
struct usb_hcd  *hcd = dev_get_drvdata(dev);
struct xhci_hcd *xhci = hcd_to_xhci(hcd);
+   int ret;
+
+   ret = pm_runtime_get_sync(dev);
+   if (ret < 0) {
+   pm_runtime_put(dev);
+   return ret;
+   }

/*
 * xhci_suspend() needs `do_wakeup` to know whether host is allowed
@@ -301,15 +313,28 @@ static int xhci_plat_suspend(struct device *dev)
 * reconsider this when xhci_plat_suspend enlarges its scope, e.g.,
 * also applies to runtime suspend.
 */
-   return xhci_suspend(xhci, device_may_wakeup(dev));
+   ret = xhci_suspend(xhci, device_may_wakeup(dev));
+   pm_runtime_put(dev);
+
+   return ret;
 }

 static int xhci_plat_resume(struct device *dev)
 {
struct usb_hcd  *hcd = dev_get_drvdata(dev);
struct xhci_hcd *xhci = hcd_to_xhci(hcd);
+   int ret;

-   return xhci_resume(xhci, 0);
+   ret = pm_runtime_get_sync(dev);
+   if (ret < 0) {
+   pm_runtime_put(dev);
+   return ret;
+   }
+
+   ret = xhci_resume(xhci, 0);
+   pm_runtime_put(dev);
+
+   return ret;
 }

 static const struct dev_pm_ops xhci_plat_pm_ops = {
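
As a side note on the get/put pairing in the quoted suspend and resume hunks: pm_runtime_get_sync() increments the device's usage counter even when the resume it triggers fails, which is why the error path calls pm_runtime_put() before returning. The sketch below models only that counter in user space. Every name in it (struct fake_pm_dev, fake_pm_get_sync(), fake_pm_put(), guarded_call(), op_ok()) is a hypothetical stand-in and not a kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device's runtime PM state. */
struct fake_pm_dev {
	int usage_count;	/* mirrors the runtime PM usage counter */
	int resume_error;	/* 0, or a negative errno the fake resume reports */
};

/* Mimics pm_runtime_get_sync(): the counter is bumped before the resume
 * attempt and is not dropped again when the resume fails. */
static int fake_pm_get_sync(struct fake_pm_dev *d)
{
	d->usage_count++;
	return d->resume_error;
}

/* Mimics pm_runtime_put(): drops one reference. */
static void fake_pm_put(struct fake_pm_dev *d)
{
	d->usage_count--;
}

/* Same shape as xhci_plat_suspend()/xhci_plat_resume() in the quoted patch:
 * take a reference, run the operation, release the reference on every path. */
static int guarded_call(struct fake_pm_dev *d, int (*op)(void))
{
	int ret;

	assert(d != NULL && op != NULL);

	ret = fake_pm_get_sync(d);
	if (ret < 0) {
		fake_pm_put(d);	/* balance the increment made by get_sync */
		return ret;
	}

	ret = op();
	fake_pm_put(d);
	return ret;
}

/* Hypothetical operation standing in for xhci_suspend()/xhci_resume(). */
static int op_ok(void)
{
	return 0;
}
```

This says nothing about Brian's actual question, namely what enabling runtime PM achieves without runtime_suspend/runtime_resume callbacks; it only illustrates why the counter stays balanced.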

Re: [PACTH v2 0/3] Implement /proc/<pid>/totmaps

2016-08-22 Thread Robert Foss




On 2016-08-22 10:12 AM, Minchan Kim wrote:

On Mon, Aug 22, 2016 at 09:40:52AM +0200, Michal Hocko wrote:

On Mon 22-08-16 09:07:45, Minchan Kim wrote:
[...]

#!/bin/sh
./smap_test &
pid=$!

for i in $(seq 25)
do
awk '/^Rss/{rss+=$2} /^Pss/{pss+=$2} END {}' \
 /proc/$pid/smaps
done
kill $pid

root@bbox:/home/barrios/test/smap# time ./s.sh
pid:21973

real    0m17.812s
user    0m12.612s
sys     0m5.187s


retested on the bare metal (x86_64 - 2CPUs)
Command being timed: "sh s.sh"
User time (seconds): 0.00
System time (seconds): 18.08
Percent of CPU this job got: 98%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:18.29

multiple runs are quite consistent in those numbers. I am running with
$ awk --version
GNU Awk 4.1.3, API: 1.1 (GNU MPFR 3.1.4, GNU MP 6.1.0)



$ ./smap_test &
pid:19658 nr_vma:65514

$ time awk '/^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}' /proc/19658/smaps

rss:263452 pss:262151

real    0m0.625s
user    0m0.404s
sys     0m0.216s

$ awk --version
GNU Awk 4.1.3, API: 1.1 (GNU MPFR 3.1.4, GNU MP 6.1.0)


like a problem we are not able to address. And I would even argue that
we want to address it in a generic way as much as possible.


Sure. What solution do you have in mind as a generic way?


either optimize seq_printf or replace it with something faster.


If it's the real culprit, I agree. However, I tested your test program on
my 2 x86 machines and my friend's machine.

Ubuntu, Fedora, Arch

They have awk 4.0.1 and 4.1.3.

The results are the same: userspace spends more time, as I mentioned.

[root@blaptop smap_test]# time awk '/^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}' /proc/3552/smaps
rss:263484 pss:262188

real    0m0.770s
user    0m0.574s
sys     0m0.197s

I will attach my test program source.
I hope you guys test and repost the results, because they are key to the
direction of the patchset.

Thanks.
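
For anyone repeating the measurement, the awk one-liner quoted above can be wrapped in a small shell function. This is only a sketch: `sum_smaps` is a name made up here, and the sample input uses illustrative numbers to show the field layout rather than data from a real process.

```shell
#!/bin/sh
# Sum the Rss and Pss fields (in kB) of smaps-formatted text on stdin.
# sum_smaps is a hypothetical helper name, not something from the thread.
sum_smaps() {
    awk '/^Rss:/ { rss += $2 }
         /^Pss:/ { pss += $2 }
         END     { printf "rss:%d pss:%d\n", rss, pss }'
}

# Illustrative input only; against a live process one would run
#   sum_smaps < /proc/$pid/smaps
sum_smaps <<'EOF'
Rss:                  12 kB
Pss:                   6 kB
Rss:                   8 kB
Pss:                   8 kB
EOF
```

Wrapping the command in `time`, as done in the thread, then splits the cost into user time (awk parsing) and system time (the kernel generating smaps).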

Re: [PACTH v2 0/3] Implement /proc/<pid>/totmaps

2016-08-18 Thread Robert Foss




On 2016-08-18 02:01 PM, Michal Hocko wrote:

On Thu 18-08-16 10:47:57, Sonny Rao wrote:

On Thu, Aug 18, 2016 at 12:44 AM, Michal Hocko wrote:

On Wed 17-08-16 11:57:56, Sonny Rao wrote:

[...]

2) User space OOM handling -- we'd rather do a more graceful shutdown
than let the kernel's OOM killer activate, so we need to gather this
information, and we'd like to be able to get it quickly enough to make
the decision in much less than 400ms


Global OOM handling in userspace is really dubious if you ask me. I
understand you want something better than SIGKILL, and in fact this is
already possible with the memory cgroup controller (btw. memcg will give
you cheap access to rss and the amount of shared and swapped-out memory
as well). Anyway, if you are getting close to OOM your system will most
probably be really busy, and chances are that reading your new file
will also take much more time. I am also not quite sure how pss is useful
for oom decisions.


I mentioned it before, but based on experience RSS just isn't good
enough -- there's too much sharing going on in our use case to make
the correct decision based on RSS.  If RSS were good enough, simply
put, this patch wouldn't exist.


But that doesn't answer my question, I am afraid. So how exactly do you
use pss for oom decisions?


So even with memcg I think we'd have the same problem?


memcg will give you instant anon, shared counters for all processes in
the memcg.


Is it technically feasible to add instant pss support to memcg?

@Sonny Rao: Would using cgroups be acceptable for chromiumos?




Don't get me wrong, /proc/<pid>/totmaps might be suitable for your
specific usecase but so far I haven't heard any sound argument for it to
be generally usable. It is true that smaps is unnecessarily costly but
at least I can see some room for improvements. A simple patch I've
posted cut the formatting overhead by 7%. Maybe we can do more.


It seems like a general problem that if you want these values the
existing kernel interface can be very expensive, so it would be
generally usable by any application which wants a per process PSS,
private data, dirty data or swap value.


yes this is really unfortunate. And if at all possible we should address
that. Precise values require the expensive rmap walk. We can introduce
some caching to help that. But so far it seems the biggest overhead is
to simply format the output and that should be addressed before any new
proc file is added.


I mentioned two use cases, but I guess I don't understand the comment
about why it's not usable by other use cases.


I might be wrong here but a use of pss is quite limited and I do not
remember anybody asking for large optimizations in that area. I still do
not understand your use cases properly so I am quite skeptical about a
general usefulness of a new file.
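
As background to the RSS versus PSS disagreement above: PSS charges each shared page to a process in proportion to how many processes map it, while RSS charges it in full to every mapper. The sketch below works that through with made-up numbers. It is simplified to whole mappings rather than individual pages (the kernel accounts per page), and `rss_vs_pss` is a hypothetical helper name.

```shell
#!/bin/sh
# Each input line is "<size_kB> <number_of_processes_mapping_it>".
# Assumption for this sketch: every page of a mapping is shared by the
# same number of processes, and that number is at least 1.
rss_vs_pss() {
    awk '{ rss += $1; pss += $1 / $2 }
         END { printf "rss:%d pss:%d\n", rss, pss }'
}

# One process with 100 kB of private memory plus a 300 kB library that
# is also mapped by three other processes (four mappers in total).
rss_vs_pss <<'EOF'
100 1
300 4
EOF
```

With these inputs the process is charged 400 kB of RSS but only 175 kB of PSS, which shows how heavy sharing can make RSS a misleading input for deciding which process to shut down.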

Re: [PACTH v9] stacktrace: Eliminate task stack trace duplication

2016-08-17 Thread Robert Foss




On 2016-08-17 03:57 PM, Josh Poimboeuf wrote:

On Wed, Aug 17, 2016 at 02:41:30PM -0400, Robert Foss wrote:

On 2016-08-17 02:06 PM, Josh Poimboeuf wrote:

On Wed, Aug 17, 2016 at 01:40:33PM -0400, Robert Foss wrote:

On 2016-08-17 12:58 PM, Josh Poimboeuf wrote:

On Wed, Aug 17, 2016 at 09:51:45AM -0400, Robert Foss wrote:

On 2016-08-17 02:50 AM, Peter Zijlstra wrote:

On Tue, Aug 16, 2016 at 07:12:36PM -0400, robert.f...@collabora.com wrote:

From: Ying Han <ying...@google.com>

The problem with a small dmesg ring buffer, such as 512k, is that only a limited
number of task traces will be logged. Sometimes we lose important information only
because of too many duplicated stack traces. This problem occurs when dumping
lots of stacks in a single operation, such as sysrq-T.

This patch tries to reduce the duplication of task stack traces in the dump
message by hashing the task stack. The hashtable is a 32k buffer pre-allocated
during bootup. Each time we find a task trace identical to one already dumped,
we dump only the pid of the task whose trace was dumped in full, so it is easy
to track back to the full stack with that pid.

When we do the hashing, we eliminate garbage entries from stack traces. Those
entries are still printed in the dump to provide more debugging information.

[   53.510162] kworker/0:0 S 8161d820 0 4  2 0x
[   53.517237]  88027547de60 0046 812ab840
[   53.524663]  880275460080 88027547dfd8 88027547dfd8 88027547dfd8
[   53.532092]  81813020 880275460080  8808758670c0
[   53.539521] Call Trace:
[   53.541974]  [] ? cfq_init_queue+0x350/0x350
[   53.547791]  [] schedule+0x29/0x70
[   53.552761]  [] worker_thread+0x233/0x380
[   53.558318]  [] ? manage_workers.isra.28+0x230/0x230
[   53.564839]  [] kthread+0x93/0xa0
[   53.569714]  [] kernel_thread_helper+0x4/0x10
[   53.575628]  [] ? kthread_worker_fn+0x140/0x140
[   53.581714]  [] ? gs_change+0xb/0xb
[   53.586762] kworker/u:0 S 8161d820 0 5  2 0x
[   53.593858]  88027547fe60 0046 a005cc70
[   53.601307]  8802754627d0 88027547ffd8 88027547ffd8 88027547ffd8
[   53.608788]  81813020 8802754627d0 00011fc0 8804758670c0
[   53.616232] Call Trace:
[   53.618676]



You might want to wait a bit and have a look at this:

  https://lkml.kernel.org/r/cover.1471011425.git.jpoim...@redhat.com



I'll have a look through that series!
Thanks!


Yeah, those patches replace dump_trace() with a new unwinder interface,
so if they get merged, this will need to be rewritten a little bit.

As for the patch itself, I'm not crazy about how it pushes the decision
of whether to print the stack of a given task down to the stack dump
code in show_trace_log_lvl().

I think I'd prefer to instead change the implementation of sysrq-T so
that it uses save_stack_trace_tsk(), and then uses
printk_stack_address() to print the stack.  Then the stack dump code in
dumpstack*.c would be completely unaffected.

Or, even better, instead of sysrq-T, can the user just read
/proc/*/{comm,stack} and /proc/sched_debug?  That gives basically the
same information without flooding printk.



Thanks for the feedback Josh!

I think the save_stack_trace_tsk() changes you are suggesting sound very
reasonable. However, requiring the user to read /proc/*/{comm,stack} sort of
circumvents the goal of the patch, which is to reduce clutter in the
default stack traces that one encounters.


Yes, but maybe the hashing and deduplication of stacks could also be
done in user space?



What would that look like in practice? A user space daemon running in the
background?


The idea was for the user of sysrq-T to instead get the stack
information from /proc.  Then they wouldn't have problems with the
printk buffer wrapping.  If after doing that, the dedupe is still needed
for some reason, the application which does the reads from /proc could
also do the dedupe.


Ah! Now I understand. Thanks for spelling it out :)

The dedup would be helpful for as long as there is duplication in stack
traces, and it would be helpful to any application. Making every
application that interacts with stack traces implement deduplication
seems to me like going about things the wrong way.


But given that stack traces are currently being reworked, it may no longer
be helpful once that series has landed.




Or am I misunderstanding the problem?  If so, it might help to spell out
the problem you're trying to solve with more specifics and explain why
it can't be solved in user space.



Nope, you are on point.
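
A rough illustration of what the user space deduplication suggested by Josh might look like follows. It is a sketch under several assumptions: `dedup_stacks` and its output format are invented here, cksum is used as a stand-in hash (so collisions are possible), and reading the real /proc/<pid>/stack typically requires root. The demo runs against a made-up directory laid out like /proc so it can be tried without privileges.

```shell
#!/bin/sh
# dedup_stacks ROOT
# ROOT is laid out like /proc: ROOT/<pid>/comm and ROOT/<pid>/stack.
# Prints each distinct stack once; later tasks with the same stack only
# get a pointer back to the first pid that showed it.
dedup_stacks() {
    root=$1
    seen=$(mktemp) || return 1
    for dir in "$root"/[0-9]*; do
        [ -r "$dir/stack" ] || continue
        pid=${dir##*/}
        comm=$(cat "$dir/comm" 2>/dev/null)
        # "<crc> <byte count>" from cksum is the (collision-prone) key.
        key=$(cksum < "$dir/stack")
        first=$(awk -F'\t' -v k="$key" '$1 == k { print $2; exit }' "$seen")
        if [ -n "$first" ]; then
            printf '%s (%s): <same stack as pid %s>\n' "$pid" "$comm" "$first"
        else
            printf '%s\t%s\n' "$key" "$pid" >> "$seen"
            printf '%s (%s):\n' "$pid" "$comm"
            sed 's/^/  /' "$dir/stack"
        fi
    done
    rm -f "$seen"
}

# Demo on a made-up tree; on a real system: dedup_stacks /proc (as root).
demo=$(mktemp -d) || exit 1
for pid in 4 5; do
    mkdir "$demo/$pid"
    echo "kworker/$pid" > "$demo/$pid/comm"
    printf 'schedule\nworker_thread\nkthread\n' > "$demo/$pid/stack"
done
dedup_stacks "$demo"
rm -rf "$demo"
```

In the demo, pid 4 is printed with its full stack and pid 5 only refers back to pid 4. Doing this outside the kernel avoids the printk buffer wrapping discussed above, at the cost of every consumer having to carry such a script.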

Re: [PACTH v9] stacktrace: Eliminate task stack trace duplication

2016-08-17 Thread Robert Foss




On 2016-08-17 02:06 PM, Josh Poimboeuf wrote:

On Wed, Aug 17, 2016 at 01:40:33PM -0400, Robert Foss wrote:



On 2016-08-17 12:58 PM, Josh Poimboeuf wrote:

On Wed, Aug 17, 2016 at 09:51:45AM -0400, Robert Foss wrote:



On 2016-08-17 02:50 AM, Peter Zijlstra wrote:

On Tue, Aug 16, 2016 at 07:12:36PM -0400, robert.f...@collabora.com wrote:

From: Ying Han <ying...@google.com>

The problem with a small dmesg ring buffer, such as 512k, is that only a limited
number of task traces will be logged. Sometimes we lose important information only
because of too many duplicated stack traces. This problem occurs when dumping
lots of stacks in a single operation, such as sysrq-T.

This patch tries to reduce the duplication of task stack traces in the dump
message by hashing the task stack. The hashtable is a 32k buffer pre-allocated
during bootup. Each time we find a task trace identical to one already dumped,
we dump only the pid of the task whose trace was dumped in full, so it is easy
to track back to the full stack with that pid.

When we do the hashing, we eliminate garbage entries from stack traces. Those
entries are still printed in the dump to provide more debugging information.

[   53.510162] kworker/0:0 S 8161d820 0 4  2 0x
[   53.517237]  88027547de60 0046 812ab840
[   53.524663]  880275460080 88027547dfd8 88027547dfd8 88027547dfd8
[   53.532092]  81813020 880275460080  8808758670c0
[   53.539521] Call Trace:
[   53.541974]  [] ? cfq_init_queue+0x350/0x350
[   53.547791]  [] schedule+0x29/0x70
[   53.552761]  [] worker_thread+0x233/0x380
[   53.558318]  [] ? manage_workers.isra.28+0x230/0x230
[   53.564839]  [] kthread+0x93/0xa0
[   53.569714]  [] kernel_thread_helper+0x4/0x10
[   53.575628]  [] ? kthread_worker_fn+0x140/0x140
[   53.581714]  [] ? gs_change+0xb/0xb
[   53.586762] kworker/u:0 S 8161d820 0 5  2 0x
[   53.593858]  88027547fe60 0046 a005cc70
[   53.601307]  8802754627d0 88027547ffd8 88027547ffd8 88027547ffd8
[   53.608788]  81813020 8802754627d0 00011fc0 8804758670c0
[   53.616232] Call Trace:
[   53.618676]



You might want to wait a bit and have a look at this:

  https://lkml.kernel.org/r/cover.1471011425.git.jpoim...@redhat.com



I'll have a look through that series!
Thanks!


Yeah, those patches replace dump_trace() with a new unwinder interface,
so if they get merged, this will need to be rewritten a little bit.

As for the patch itself, I'm not crazy about how it pushes the decision
of whether to print the stack of a given task down to the stack dump
code in show_trace_log_lvl().

I think I'd prefer to instead change the implementation of sysrq-T so
that it uses save_stack_trace_tsk(), and then uses
printk_stack_address() to print the stack.  Then the stack dump code in
dumpstack*.c would be completely unaffected.

Or, even better, instead of sysrq-T, can the user just read
/proc/*/{comm,stack} and /proc/sched_debug?  That gives basically the
same information without flooding printk.



Thanks for the feedback Josh!

I think the save_stack_trace_tsk() changes you are suggesting sound very
reasonable. However, requiring the user to read /proc/*/{comm,stack} sort of
circumvents the goal of the patch, which is to reduce clutter in the
default stack traces that one encounters.


Yes, but maybe the hashing and deduplication of stacks could also be
done in user space?



What would that look like in practice? A user space daemon running in 
the background?

Re: [PACTH v9] stacktrace: Eliminate task stack trace duplication

2016-08-17 Thread Robert Foss




On 2016-08-17 12:58 PM, Josh Poimboeuf wrote:

On Wed, Aug 17, 2016 at 09:51:45AM -0400, Robert Foss wrote:



On 2016-08-17 02:50 AM, Peter Zijlstra wrote:

On Tue, Aug 16, 2016 at 07:12:36PM -0400, robert.f...@collabora.com wrote:

From: Ying Han <ying...@google.com>

The problem with a small dmesg ring buffer, such as 512k, is that only a limited
number of task traces will be logged. Sometimes we lose important information only
because of too many duplicated stack traces. This problem occurs when dumping
lots of stacks in a single operation, such as sysrq-T.

This patch tries to reduce the duplication of task stack traces in the dump
message by hashing the task stack. The hashtable is a 32k buffer pre-allocated
during bootup. Each time we find a task trace identical to one already dumped,
we dump only the pid of the task whose trace was dumped in full, so it is easy
to track back to the full stack with that pid.

When we do the hashing, we eliminate garbage entries from stack traces. Those
entries are still printed in the dump to provide more debugging information.

[   53.510162] kworker/0:0 S 8161d820 0 4  2 0x
[   53.517237]  88027547de60 0046 812ab840
[   53.524663]  880275460080 88027547dfd8 88027547dfd8 88027547dfd8
[   53.532092]  81813020 880275460080  8808758670c0
[   53.539521] Call Trace:
[   53.541974]  [] ? cfq_init_queue+0x350/0x350
[   53.547791]  [] schedule+0x29/0x70
[   53.552761]  [] worker_thread+0x233/0x380
[   53.558318]  [] ? manage_workers.isra.28+0x230/0x230
[   53.564839]  [] kthread+0x93/0xa0
[   53.569714]  [] kernel_thread_helper+0x4/0x10
[   53.575628]  [] ? kthread_worker_fn+0x140/0x140
[   53.581714]  [] ? gs_change+0xb/0xb
[   53.586762] kworker/u:0 S 8161d820 0 5  2 0x
[   53.593858]  88027547fe60 0046 a005cc70
[   53.601307]  8802754627d0 88027547ffd8 88027547ffd8 88027547ffd8
[   53.608788]  81813020 8802754627d0 00011fc0 8804758670c0
[   53.616232] Call Trace:
[   53.618676]



You might want to wait a bit and have a look at this:

  https://lkml.kernel.org/r/cover.1471011425.git.jpoim...@redhat.com



I'll have a look through that series!
Thanks!


Yeah, those patches replace dump_trace() with a new unwinder interface,
so if they get merged, this will need to be rewritten a little bit.

As for the patch itself, I'm not crazy about how it pushes the decision
of whether to print the stack of a given task down to the stack dump
code in show_trace_log_lvl().

I think I'd prefer to instead change the implementation of sysrq-T so
that it uses save_stack_trace_tsk(), and then uses
printk_stack_address() to print the stack.  Then the stack dump code in
dumpstack*.c would be completely unaffected.

Or, even better, instead of sysrq-T, can the user just read
/proc/*/{comm,stack} and /proc/sched_debug?  That gives basically the
same information without flooding printk.



Thanks for the feedback Josh!

I think the save_stack_trace_tsk() changes you are suggesting sound very
reasonable. However, requiring the user to read /proc/*/{comm,stack} sort
of circumvents the goal of the patch, which is to reduce clutter in
the default stack traces that one encounters.


I'll put this patch on the back-burner until the above-mentioned series
either lands or is discarded.



Rob.

Re: [PATCH v3] mmc: sdhci: Do not allow tuning procedure to be interrupted

2016-08-17 Thread Robert Foss




On 2016-08-17 06:47 AM, Adrian Hunter wrote:

On 17/08/16 00:25, robert.f...@collabora.com wrote:

From: Christopher Freeman <cfree...@nvidia.com>

wait_event_interruptible_timeout() will return early if the blocked
process receives a signal, causing the driver to abort the tuning
procedure and possibly leaving the controller in a bad state.  Since the
tuning command is expected to complete quickly (<50ms) and we've set a
timeout, use wait_event_timeout() instead.

Signed-off-by: Christopher Freeman <cfree...@nvidia.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Reviewed-by: Benson Leung <ble...@chromium.org>


The mmc block queues are kernel threads which I would expect ignore signals,
so I am curious how you hit this?


The issue was discovered on (tegra2?) hardware that is sensitive to 
being interrupted during tuning and having the controller left in a 
sensitive state.


@Christopher Freeman: Maybe you can provide us with some additional details?



In any case:

Acked-by: Adrian Hunter <adrian.hun...@intel.com>

[PATCH v4] mmc: sdhci: Do not allow tuning procedure to be interrupted

2016-08-17 Thread robert . foss

From: Christopher Freeman <cfree...@nvidia.com>

wait_event_interruptible_timeout() will return early if the blocked
process receives a signal, causing the driver to abort the tuning
procedure and possibly leaving the controller in a bad state.  Since the
tuning command is expected to complete quickly (<50ms) and we've set a
timeout, use wait_event_timeout() instead.

Signed-off-by: Christopher Freeman <cfree...@nvidia.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Reviewed-by: Benson Leung <ble...@chromium.org>
Acked-by: Adrian Hunter <adrian.hun...@intel.com>
---

Changes since v1:
- Added proper metadata tags to series.

Changes since v2:
- Added "Reviewed-by: Benson Leung <ble...@chromium.org>"

Changes since v3:
- Added "Acked-by: Adrian Hunter <adrian.hun...@intel.com>"

 drivers/mmc/host/sdhci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 0e3d7c0..9e80203 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1960,7 +1960,7 @@ static int sdhci_execute_tuning(struct mmc_host *mmc, u32 opcode)
 
 			spin_unlock_irqrestore(&host->lock, flags);
 			/* Wait for Buffer Read Ready interrupt */
-			wait_event_interruptible_timeout(host->buf_ready_int,
+			wait_event_timeout(host->buf_ready_int,
 						(host->tuning_done == 1),
 						msecs_to_jiffies(50));
 			spin_lock_irqsave(&host->lock, flags);
-- 
2.7.4

Re: [PATCH v2 0/3] Implement /proc/<pid>/totmaps

2016-08-17 Thread Robert Foss




On 2016-08-17 09:03 AM, Michal Hocko wrote:

On Wed 17-08-16 11:31:25, Jann Horn wrote:

On Wed, Aug 17, 2016 at 10:22:00AM +0200, Michal Hocko wrote:

On Tue 16-08-16 12:46:51, Robert Foss wrote:
[...]

$ /usr/bin/time -v -p zsh -c "repeat 25 { awk '/^Rss/{rss+=\$2}
/^Pss/{pss+=\$2} END {printf \"rss:%d pss:%d\n\", rss, pss}\'
/proc/5025/smaps }"
[...]
Command being timed: "zsh -c repeat 25 { awk '/^Rss/{rss+=$2}
/^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}\' /proc/5025/smaps
}"
User time (seconds): 0.37
System time (seconds): 0.45
Percent of CPU this job got: 92%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.89


This is really unexpected. Where is the user time spent? Anyway, rather
than measuring some random processes I've tried to measure something
resembling the worst case. So I've created a simple program to mmap as
much as possible:

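#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>
int main()
{
	/* Map single pages until mmap() fails, creating as many mappings as possible. */
	while (mmap(NULL, 4096, PROT_READ|PROT_WRITE,
		    MAP_ANON|MAP_SHARED|MAP_POPULATE, -1, 0) != MAP_FAILED)
		;

	printf("pid:%d\n", getpid());
	pause();
	return 0;
}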


Ah, nice, that's a reasonable test program. :)



So with a reasonable user space the parsing is really not all that time
consuming wrt. smaps handling. That being said I am still very skeptical
about a dedicated proc file which accomplishes what userspace can do
in a trivial way.


Now, since your numbers showed that all the time is spent in the kernel,
also create this test program to just read that file over and over again:

$ cat justreadloop.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <err.h>
#include <stdio.h>

char buf[100];

int main(int argc, char **argv) {
  printf("pid:%d\n", getpid());
  /* Re-open and read the start of the file forever; stop with Ctrl-C. */
  while (1) {
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) continue;
    if (read(fd, buf, sizeof(buf)) < 0)
      err(1, "read");
    close(fd);
  }
}
$ gcc -Wall -o justreadloop justreadloop.c
$

Now launch your test:

$ ./mapstuff
pid:29397

point justreadloop at it:

$ ./justreadloop /proc/29397/smaps
pid:32567

... and then check the performance stats of justreadloop:

# perf top -p 32567

This is what I see:

Samples: 232K of event 'cycles:ppp', Event count (approx.): 60448424325
Overhead  Shared Object Symbol
  30,43%  [kernel]  [k] format_decode
   9,12%  [kernel]  [k] number
   7,66%  [kernel]  [k] vsnprintf
   7,06%  [kernel]  [k] __lock_acquire
   3,23%  [kernel]  [k] lock_release
   2,85%  [kernel]  [k] debug_lockdep_rcu_enabled
   2,25%  [kernel]  [k] skip_atoi
   2,13%  [kernel]  [k] lock_acquire
   2,05%  [kernel]  [k] show_smap


This is a lot! I would expect the rmap walk to consume more but it even
doesn't show up in the top consumers.


That's at least 30.43% + 9.12% + 7.66% = 47.21% of the task's kernel
time spent on evaluating format strings. The new interface
wouldn't have to spend that much time on format strings because there
isn't so much text to format.


well, this is true of course but I would much rather try to reduce the
overhead of smaps file than add a new file. The following should help
already. I've measured ~7% systime cut down. I guess there is still some
room for improvements but I have to say I'm far from being convinced about
a new proc file just because we suck at dumping information to the
userspace. If this was something like /proc/<pid>/stat which is
essentially read all the time then it would be a different question but
is the rss, pss going to be read all that often? If yes why? These are the
questions which should be answered before we even start considering the
implementation.


@Sonny Rao: Maybe you can comment on how often, for how many processes 
this information is needed and for which reasons this information is useful.



---
From 2a6883a7278ff8979808cb8e2dbcefe5ea3bf672 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mho...@suse.com>
Date: Wed, 17 Aug 2016 14:00:13 +0200
Subject: [PATCH] proc, smaps: reduce printing overhead

seq_printf (used by show_smap) can be pretty expensive when dumping a
lot of numbers.  Say we would like to get Rss and Pss from a particular
process.  In order to measure a pathological case let's generate as many
mappings as possible:

$ cat max_mmap.c
int main()
{
	while (mmap(NULL, 4096, PROT_READ|PROT_WRITE,
		    MAP_ANON|MAP_SHARED|MAP_POPULATE, -1, 0) != MAP_FAILED)
		;

	printf("pid:%d\n", getpid());
	pause();
	return 0;
}

$ awk '/^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}' /proc/$pid/smaps

would do the trick. The whole runtime is in the kernel space which is not
that unexpected because smaps is not the cheapest one (we have to
do rmap walk etc.).

Command being timed: "awk /^Rss/{rss+=$2} /^Pss/{pss+=$2} END {p

Re: [PATCH v9] stacktrace: Eliminate task stack trace duplication

2016-08-17 Thread Robert Foss




On 2016-08-17 02:50 AM, Peter Zijlstra wrote:

On Tue, Aug 16, 2016 at 07:12:36PM -0400, robert.f...@collabora.com wrote:

From: Ying Han 

The problem with a small dmesg ring buffer, like 512k, is that only a limited
number of task traces will be logged. Sometimes we lose important information
only because of too many duplicated stack traces. This problem occurs when
dumping lots of stacks in a single operation, such as sysrq-T.

This patch tries to reduce the duplication of task stack traces in the dump
message by hashing the task stack. The hashtable is a 32k buffer pre-allocated
during bootup. Each time we find a task whose stack trace is identical to one
already dumped, we print only the pid of the task whose trace was dumped. So it
is easy to backtrack to the full stack with the pid.

When we do the hashing, we eliminate garbage entries from stack traces. Those
entries are still printed in the dump to provide more debugging
information.

[   53.510162] kworker/0:0 S 8161d820 0 4  2 0x
[   53.517237]  88027547de60 0046 812ab840
[   53.524663]  880275460080 88027547dfd8 88027547dfd8 88027547dfd8
[   53.532092]  81813020 880275460080  8808758670c0
[   53.539521] Call Trace:
[   53.541974]  [] ? cfq_init_queue+0x350/0x350
[   53.547791]  [] schedule+0x29/0x70
[   53.552761]  [] worker_thread+0x233/0x380
[   53.558318]  [] ? manage_workers.isra.28+0x230/0x230
[   53.564839]  [] kthread+0x93/0xa0
[   53.569714]  [] kernel_thread_helper+0x4/0x10
[   53.575628]  [] ? kthread_worker_fn+0x140/0x140
[   53.581714]  [] ? gs_change+0xb/0xb
[   53.586762] kworker/u:0 S 8161d820 0 5  2 0x
[   53.593858]  88027547fe60 0046 a005cc70
[   53.601307]  8802754627d0 88027547ffd8 88027547ffd8 88027547ffd8
[   53.608788]  81813020 8802754627d0 00011fc0 8804758670c0
[   53.616232] Call Trace:
[   53.618676]



You might want to wait a bit and have a look at this:

  https://lkml.kernel.org/r/cover.1471011425.git.jpoim...@redhat.com



I'll have a look through that series!
Thanks!

[PATCH v9] stacktrace: Eliminate task stack trace duplication

2016-08-16 Thread robert . foss

From: Ying Han <ying...@google.com>

The problem with a small dmesg ring buffer, like 512k, is that only a limited
number of task traces will be logged. Sometimes we lose important information
only because of too many duplicated stack traces. This problem occurs when
dumping lots of stacks in a single operation, such as sysrq-T.

This patch tries to reduce the duplication of task stack traces in the dump
message by hashing the task stack. The hashtable is a 32k buffer pre-allocated
during bootup. Each time we find a task whose stack trace is identical to one
already dumped, we print only the pid of the task whose trace was dumped. So it
is easy to backtrack to the full stack with the pid.

When we do the hashing, we eliminate garbage entries from stack traces. Those
entries are still printed in the dump to provide more debugging
information.

[   53.510162] kworker/0:0 S 8161d820 0 4  2 0x
[   53.517237]  88027547de60 0046 812ab840
[   53.524663]  880275460080 88027547dfd8 88027547dfd8 88027547dfd8
[   53.532092]  81813020 880275460080  8808758670c0
[   53.539521] Call Trace:
[   53.541974]  [] ? cfq_init_queue+0x350/0x350
[   53.547791]  [] schedule+0x29/0x70
[   53.552761]  [] worker_thread+0x233/0x380
[   53.558318]  [] ? manage_workers.isra.28+0x230/0x230
[   53.564839]  [] kthread+0x93/0xa0
[   53.569714]  [] kernel_thread_helper+0x4/0x10
[   53.575628]  [] ? kthread_worker_fn+0x140/0x140
[   53.581714]  [] ? gs_change+0xb/0xb
[   53.586762] kworker/u:0 S 8161d820 0 5  2 0x
[   53.593858]  88027547fe60 0046 a005cc70
[   53.601307]  8802754627d0 88027547ffd8 88027547ffd8 88027547ffd8
[   53.608788]  81813020 8802754627d0 00011fc0 8804758670c0
[   53.616232] Call Trace:
[   53.618676]

Signed-off-by: Ying Han <ying...@google.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
Signed-off-by: Robert Foss <robert.f...@collabora.com>

---

This is a resubmission of v9.

This series has previously had 8 versions submitted and v8 was acked, after
which the series was dropped on the floor.

https://lkml.org/lkml/2012/4/6/255

changelog v9..v8:
1. rebase on v4.8-trunk.
2. change return type of save_dup_stack_address

changelog v8..v7:
1. rebase on v3.4-rc1.

changelog v7..v6:
1. rebase on v3.3_rc2, the only change is moving changes from kernel/sched.c
to kernel/sched/core.c

changelog v6..v5:
1. clear saved stack trace before printing a set of stacks. this ensures the
printed stack traces are not omitted messages.
2. add log level in printing duplicate stack.
3. remove the show_stack() API change, and non-x86 arch won't need further
change.
4. add more inline documentation.

changelog v5..v4:
1. removed changes to Kconfig file
2. changed hashtable to keep only hash value and length of stack
3. simplified hashtable lookup

changelog v4..v3:
1. improve de-duplication by eliminating garbage entries from stack traces.
with this change 793/825 stack traces were recognized as duplicates. in v3
only 482/839 were duplicates.

changelog v3..v2:
1. again better documentation on the patch description.
2. make the stack_hash_table to be allocated at compile time.
3. have better name of variable index
4. move save_dup_stack_trace() in kernel/stacktrace.c

changelog v2..v1:
1. better documentation on the patch description
2. move the spinlock inside the hash lookup, reducing the holding time.

 arch/x86/include/asm/stacktrace.h |  11 ++-
 arch/x86/kernel/dumpstack.c       |  12 ++--
 arch/x86/kernel/dumpstack_32.c    |   7 +-
 arch/x86/kernel/dumpstack_64.c    |   7 +-
 arch/x86/kernel/stacktrace.c      | 137 ++
 include/linux/stacktrace.h        |   8 +++
 kernel/sched/core.c               |  33 -
 kernel/stacktrace.c               |  22 ++
 8 files changed, 222 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h
index 0944218..d100e69 100644
--- a/arch/x86/include/asm/stacktrace.h
+++ b/arch/x86/include/asm/stacktrace.h
@@ -81,13 +81,20 @@ stack_frame(struct task_struct *task, struct pt_regs *regs)
 }
 #endif
 
+/*
+ * The parameter dup_stack_pid is used for task stack deduplication.
+ * The non-zero value of dup_stack_pid indicates the pid of the
+ * task with the same stack trace.
+ */
 extern void
 show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
-		   unsigned long *stack, unsigned long bp, char *log_lvl);
+		   unsigned long *stack, unsigned long bp, char *log_lvl,
+		   pid_t dup_stack_pid);
 
 extern void
 show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
-		   unsigned long *sp, unsigned long bp, char *log_lvl);
+		   unsigned long *sp, unsigned long bp, char *log_lvl,
+		   pid_t dup_stack_pid);
 
 extern unsigned int code_bytes;

[PACTH v4 3/3] Documentation/filesystems: Added /proc/PID/totmaps documentation

2016-08-16 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

Added documentation covering /proc/PID/totmaps.

Signed-off-by: Robert Foss <robert.f...@collabora.com>
---
 Documentation/filesystems/proc.txt | 21 +
 1 file changed, 21 insertions(+)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 7d001be..4cb97df 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -11,6 +11,7 @@ Version 1.3  
Kernel version 2.2.12
  Kernel version 2.4.0-test11-pre4
 --
 fixes/update part 1.1  Stefani Seibold <stef...@seibold.net>   June 9 2009
+add totmaps    Robert Foss <robert.f...@collabora.com>  August 12 2016
 
 Table of Contents
 -
@@ -147,6 +148,8 @@ Table 1-1: Process specific entries in /proc
  stack Report full stack trace, enable via CONFIG_STACKTRACE
  smaps an extension based on maps, showing the memory consumption of
each mapping and flags associated with it
+ totmaps an extension based on maps, showing the total memory
+consumption of all mappings
  numa_maps an extension based on maps, showing the memory locality and
binding policy as well as mem usage (in pages) of each mapping.
 ..
@@ -512,6 +515,24 @@ be vanished or the reverse -- new added.
 This file is only present if the CONFIG_MMU kernel configuration option is
 enabled.
 
+The /proc/PID/totmaps is an extension based on maps, showing the memory
+consumption totals for all of the process's mappings. It lists the sums of the
+same statistics as /proc/PID/smaps.
+
+The process' mappings will be summarized as a series of lines like the
+following:
+
+Rss:                4256 kB
+Pss:                1170 kB
+Shared_Clean:       2720 kB
+Shared_Dirty:       1136 kB
+Private_Clean:         0 kB
+Private_Dirty:       400 kB
+Referenced:         4256 kB
+Anonymous:          1536 kB
+AnonHugePages:         0 kB
+Swap:                  0 kB
+
 The /proc/PID/clear_refs is used to reset the PG_Referenced and ACCESSED/YOUNG
 bits on both physical and virtual pages associated with a process, and the
 soft-dirty bit on pte (see Documentation/vm/soft-dirty.txt for details).
-- 
2.7.4

[PACTH v4 2/3] Documentation/filesystems: Fixed typo

2016-08-16 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

Fixed a -> an typo.

Signed-off-by: Robert Foss <robert.f...@collabora.com>
---
 Documentation/filesystems/proc.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index e8d0075..7d001be 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -145,7 +145,7 @@ Table 1-1: Process specific entries in /proc
symbol the task is blocked in - or "0" if not blocked.
  pagemap   Page table
  stack Report full stack trace, enable via CONFIG_STACKTRACE
- smaps a extension based on maps, showing the memory consumption of
+ smaps an extension based on maps, showing the memory consumption of
each mapping and flags associated with it
  numa_maps an extension based on maps, showing the memory locality and
binding policy as well as mem usage (in pages) of each mapping.
-- 
2.7.4

[PACTH v4 0/3] Implement /proc/PID/totmaps

2016-08-16 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>


This series provides the /proc/PID/totmaps feature, which 
summarizes the information provided by /proc/PID/smaps for
improved performance and usability reasons.

A use case is to speed up monitoring of memory consumption in 
environments where RSS isn't precise.

For example, Chrome tends to have many processes with hundreds of VMAs
and a substantial amount of shared memory, and the error of using
RSS rather than PSS tends to be very large when looking at overall
memory consumption.  PSS isn't kept as a single number that's exported
like RSS, so calculating PSS means having to parse a very large smaps
file.
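
To make the RSS/PSS gap concrete, here is a minimal user-space sketch of the
arithmetic. It assumes 4 kB pages and a fixed-point shift of 12, modelled on
the "mss_sum.pss >> (10 + PSS_SHIFT)" expression in patch 1/3. The struct,
macro and function names are invented for illustration and are not kernel API.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */
#define SKETCH_PSS_SHIFT 12     /* assumed fixed-point precision */

/* Hypothetical per-process totals, in kB. */
struct sketch_totals {
	uint64_t rss_kb;
	uint64_t pss_kb;
};

/*
 * mapcounts[i] is the number of processes mapping resident page i (>= 1).
 * RSS charges every page in full; PSS charges 1/mapcount of each page,
 * accumulated in fixed point so the fractions are not lost per page.
 */
static struct sketch_totals sketch_account(const unsigned *mapcounts, size_t n)
{
	uint64_t rss = 0, pss_fixed = 0;
	struct sketch_totals t;

	for (size_t i = 0; i < n; i++) {
		rss += SKETCH_PAGE_SIZE;
		pss_fixed += ((uint64_t)SKETCH_PAGE_SIZE << SKETCH_PSS_SHIFT)
			     / mapcounts[i];
	}

	t.rss_kb = rss >> 10;
	t.pss_kb = pss_fixed >> (10 + SKETCH_PSS_SHIFT);
	return t;
}
```

With three resident pages mapped by 1, 2 and 4 processes, this reports 12 kB
of RSS but only 7 kB of PSS, which is the kind of gap described above.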

This process is slow and has to be repeated for many processes, and we
found that just the act of doing the parsing was taking up a
significant amount of CPU time, so this patch is an attempt to make
that process cheaper.

/proc/PID/totmaps provides roughly a 2x speedup compared to parsing
/proc/PID/smaps with awk.

$ ps aux | grep firefox
robertfoss   5025 24.3 13.7 3562820 2219616 ? Rl   Aug15 277:44 /usr/lib/firefox/firefox https://allg.one/xpb
$ awk '/^[0-9a-f]/{print}' /proc/5025/smaps | wc -l
1503
$ /usr/bin/time -v -p zsh -c "(repeat 25 {cat /proc/5025/totmaps})"
[...]
Command being timed: "zsh -c (repeat 25 {cat /proc/5025/totmaps})"
User time (seconds): 0.00
System time (seconds): 0.40
Percent of CPU this job got: 90%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.45

$ /usr/bin/time -v -p zsh -c "repeat 25 { awk '/^Rss/{rss+=\$2} /^Pss/{pss+=\$2} END {printf \"rss:%d pss:%d\n\", rss, pss}\' /proc/5025/smaps }"
[...]
Command being timed: "zsh -c repeat 25 { awk '/^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}\' /proc/5025/smaps }"
User time (seconds): 0.37
System time (seconds): 0.45
Percent of CPU this job got: 92%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.89
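
For comparison, the user-space work that totmaps is meant to replace can be
sketched in C as a rough equivalent of the awk one-liner above. The function
name is hypothetical; a monitoring tool would presumably call it on a stream
opened from /proc/PID/smaps, once per field or with a variant that collects
several fields in one pass.

```c
#include <stdio.h>
#include <string.h>

/*
 * Sum every "<field> <n> kB" line in smaps-formatted text and return the
 * total in kB. `field` includes the trailing colon, e.g. "Pss:", and is
 * only matched at the start of a line.
 */
static unsigned long sum_smaps_field(FILE *f, const char *field)
{
	char line[256];
	size_t len = strlen(field);
	unsigned long total = 0, kb;

	while (fgets(line, sizeof(line), f)) {
		if (strncmp(line, field, len) == 0 &&
		    sscanf(line + len, "%lu", &kb) == 1)
			total += kb;
	}
	return total;
}
```

Every VMA contributes its own block of lines, so a process with 1503 mappings
means reading and matching all of those blocks for each sample.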


Changes since v1:
- Removed IS_ERR check from get_task_mm() function
- Changed comment format
- Moved proc_totmaps_operations declaration inside internal.h
- Switched to using do_maps_open() in totmaps_open() function,
  which provides privilege checking
- Error handling reworked for totmaps_open() function
- Switched to stack allocated struct mem_size_stats mss_sum in
  totmaps_proc_show() function
- Removed get_task_mm() in totmaps_proc_show() since priv->mm
  already is available
- Added support to proc_map_release() for priv==NULL, to allow
  function to be used for all failure cases
- Added proc_totmaps_op and helper functions for it
- Added documentation in separate patch
- Removed totmaps_release() since it was just a wrapper for proc_map_release()

Changes since v2:
- Removed struct mem_size_stats *mss from struct proc_maps_private
- Removed priv->task assignment in totmaps_open() call
- Moved some assignments in totmaps_open() around to increase code
  clarity
- Moved some function calls to unlock data structures before printing

Changes since v3:
- Fixed typo in totmaps documentation
- Fixed issue where proc_map_release wasn't called on error
- Fixed put_task_struct not being called during .release()

Robert Foss (3):
  mm, proc: Implement /proc/PID/totmaps
  Documentation/filesystems: Fixed typo
  Documentation/filesystems: Added /proc/PID/totmaps documentation

 Documentation/filesystems/proc.txt |  23 +-
 fs/proc/base.c |   1 +
 fs/proc/internal.h |   2 +
 fs/proc/task_mmu.c | 141 +
 4 files changed, 166 insertions(+), 1 deletion(-)

-- 
2.7.4

[PACTH v4 1/3] mm, proc: Implement /proc/PID/totmaps

2016-08-16 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

This is based on earlier work by Thiago Goncales. It implements a new
per process proc file which summarizes the contents of the smaps file
but doesn't display any addresses.  It gives more detailed information
than statm, such as the PSS (proportional set size).  It differs from the
original implementation in that it doesn't use the full blown set of
seq operations, uses a different termination condition, and doesn't
display "Locked" as that was broken in the original implementation.

This new proc file provides information faster than parsing the potentially
huge smaps file.

Tested-by: Robert Foss <robert.f...@collabora.com>
Signed-off-by: Robert Foss <robert.f...@collabora.com>

Signed-off-by: Sonny Rao <sonny...@chromium.org>
---
 fs/proc/base.c |   1 +
 fs/proc/internal.h |   2 +
 fs/proc/task_mmu.c | 141 +
 3 files changed, 144 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index a11eb71..de3acdf 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2855,6 +2855,7 @@ static const struct pid_entry tgid_base_stuff[] = {
REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
REG("smaps",  S_IRUGO, proc_pid_smaps_operations),
REG("pagemap",S_IRUSR, proc_pagemap_operations),
+   REG("totmaps",S_IRUGO, proc_totmaps_operations),
 #endif
 #ifdef CONFIG_SECURITY
DIR("attr",   S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index aa27810..99f97d7 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -297,6 +297,8 @@ extern const struct file_operations proc_pid_smaps_operations;
 extern const struct file_operations proc_tid_smaps_operations;
 extern const struct file_operations proc_clear_refs_operations;
 extern const struct file_operations proc_pagemap_operations;
+extern const struct file_operations proc_totmaps_operations;
+
 
 extern unsigned long task_vsize(struct mm_struct *);
 extern unsigned long task_statm(struct mm_struct *,
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4648c7f..fd8fd7f 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -802,6 +802,75 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
return 0;
 }
 
+static void add_smaps_sum(struct mem_size_stats *mss,
+   struct mem_size_stats *mss_sum)
+{
+   mss_sum->resident += mss->resident;
+   mss_sum->pss += mss->pss;
+   mss_sum->shared_clean += mss->shared_clean;
+   mss_sum->shared_dirty += mss->shared_dirty;
+   mss_sum->private_clean += mss->private_clean;
+   mss_sum->private_dirty += mss->private_dirty;
+   mss_sum->referenced += mss->referenced;
+   mss_sum->anonymous += mss->anonymous;
+   mss_sum->anonymous_thp += mss->anonymous_thp;
+   mss_sum->swap += mss->swap;
+}
+
+static int totmaps_proc_show(struct seq_file *m, void *data)
+{
+   struct proc_maps_private *priv = m->private;
+   struct mm_struct *mm = priv->mm;
+   struct vm_area_struct *vma;
+   struct mem_size_stats mss_sum;
+
+   memset(&mss_sum, 0, sizeof(mss_sum));
+   down_read(&mm->mmap_sem);
+   hold_task_mempolicy(priv);
+
+   for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
+   struct mem_size_stats mss;
+   struct mm_walk smaps_walk = {
+   .pmd_entry = smaps_pte_range,
+   .mm = vma->vm_mm,
+   .private = &mss,
+   };
+
+   if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
+   memset(&mss, 0, sizeof(mss));
+   walk_page_vma(vma, &smaps_walk);
+   add_smaps_sum(&mss, &mss_sum);
+   }
+   }
+
+   release_task_mempolicy(priv);
+   up_read(&mm->mmap_sem);
+
+   seq_printf(m,
+  "Rss:%8lu kB\n"
+  "Pss:%8lu kB\n"
+  "Shared_Clean:   %8lu kB\n"
+  "Shared_Dirty:   %8lu kB\n"
+  "Private_Clean:  %8lu kB\n"
+  "Private_Dirty:  %8lu kB\n"
+  "Referenced: %8lu kB\n"
+  "Anonymous:  %8lu kB\n"
+  "AnonHugePages:  %8lu kB\n"
+  "Swap:   %8lu kB\n",
+  mss_sum.resident >> 10,
+  (unsigned long)(mss_sum.pss >> (10 + PSS_SHIFT)),
+  mss_sum.shared_clean  >> 10,
+  mss_sum.shared_dirty  >> 10,
+  mss_sum.private_clean >> 10,
+  mss_sum.private_dirty >> 10,
+  mss_sum.referenced >> 10,
+  mss_sum.anonymous >> [...]

[PACTH v3] mmc: sdhci: Do not allow tuning procedure to be interrupted

2016-08-16 Thread robert . foss

From: Christopher Freeman <cfree...@nvidia.com>

wait_event_interruptible_timeout() will return early if the blocked
process receives a signal, causing the driver to abort the tuning
procedure and possibly leaving the controller in a bad state.  Since the
tuning command is expected to complete quickly (<50ms) and we've set a
timeout, use wait_event_timeout() instead.

Signed-off-by: Christopher Freeman <cfree...@nvidia.com>
Tested-by: Robert Foss <robert.f...@collabora.com>
Signed-off-by: Robert Foss <robert.f...@collabora.com>
Reviewed-by: Benson Leung <ble...@chromium.org>
---
 drivers/mmc/host/sdhci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 0e3d7c0..9e80203 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1960,7 +1960,7 @@ static int sdhci_execute_tuning(struct mmc_host *mmc, u32 opcode)
 
spin_unlock_irqrestore(&host->lock, flags);
/* Wait for Buffer Read Ready interrupt */
-   wait_event_interruptible_timeout(host->buf_ready_int,
+   wait_event_timeout(host->buf_ready_int,
(host->tuning_done == 1),
msecs_to_jiffies(50));
spin_lock_irqsave(&host->lock, flags);
-- 
2.7.4
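
As a loose user-space analogy only (this is not the kernel wait queue API and
does not appear in the patch), the same concern shows up with POSIX
nanosleep(): a signal makes it return early with EINTR, so code that must wait
out the full period has to retry with the remaining time. The helper name
below is hypothetical.

```c
#include <errno.h>
#include <time.h>

/*
 * Sleep for the full duration even if signals arrive.
 * nanosleep() returns -1 with errno == EINTR when interrupted and stores
 * the unslept time in its second argument, so we retry with that remainder.
 */
static int sleep_full_ms(long ms)
{
	struct timespec req = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};
	struct timespec rem;

	while (nanosleep(&req, &rem) == -1) {
		if (errno != EINTR)
			return -1;  /* real error, give up */
		req = rem;          /* interrupted: wait out the rest */
	}
	return 0;
}
```

The patch reaches the same goal more directly: wait_event_timeout() is simply
not woken by signals, so no retry loop is needed in the driver.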

Re: [PACTH v3 1/3] mm, proc: Implement /proc/PID/totmaps

2016-08-16 Thread Robert Foss




On 2016-08-16 02:18 PM, Jann Horn wrote:

On Tue, Aug 16, 2016 at 01:34:14PM -0400, robert.f...@collabora.com wrote:

From: Robert Foss <robert.f...@collabora.com>

This is based on earlier work by Thiago Goncales. It implements a new
per process proc file which summarizes the contents of the smaps file
but doesn't display any addresses.  It gives more detailed information
than statm, such as the PSS (proportional set size).  It differs from the
original implementation in that it doesn't use the full blown set of
seq operations, uses a different termination condition, and doesn't
display "Locked" as that was broken in the original implementation.

This new proc file provides information faster than parsing the potentially
huge smaps file.

Tested-by: Robert Foss <robert.f...@collabora.com>
Signed-off-by: Robert Foss <robert.f...@collabora.com>

Signed-off-by: Sonny Rao <sonny...@chromium.org>
---

[...]

+static int totmaps_open(struct inode *inode, struct file *file)
+{
+   struct proc_maps_private *priv = NULL;
+   struct seq_file *seq;
+   int ret;
+
+   ret = do_maps_open(inode, file, &proc_totmaps_op);
+   if (ret)
+   goto error;
+
+   /*
+* We need to grab references to the task_struct
+* at open time, because there's a potential information
+* leak where the totmaps file is opened and held open
+* while the underlying pid to task mapping changes
+* underneath it
+*/
+   seq = file->private_data;
+   priv = seq->private;
+   priv->task = get_proc_task(inode);
+   if (!priv->task) {
+   ret = -ESRCH;
+   goto error;


I see that you removed the proc_map_release() call for the upper
error case as I recommended. However, for the second error case,
you do have to call it because do_maps_open() succeeded.

You could fix this by turning the first "goto error;" into
"return;" and adding the proc_map_release() call back in after
the "error:" label. This would be fine - if an error branch just
needs to return an error code, it's okay to do so directly
without jumping to an error label.

Alternatively, you could add a second label
in front of the existing "error:" label, jump to the new label
for the second error case, and call proc_map_release() between
the new label and the old one.


Ah, naturally. Thanks for the patience and advice!





+   }
+
+   return 0;
+
+error:
+   return ret;
+}
+
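
The error-handling shape suggested in the review can be sketched in plain
user-space C. Everything here is a hypothetical stand-in: the two malloc()
calls play the roles of do_maps_open() and get_proc_task(), and the fail_*
flags exist only so both error paths can be exercised.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical two-step setup, standing in for the open path above. */
struct ctx {
	void *maps;   /* acquired first */
	void *task;   /* acquired second */
};

static int open_ctx(struct ctx *c, int fail_first, int fail_second)
{
	int ret;

	c->maps = fail_first ? NULL : malloc(16);
	if (!c->maps)
		return -ENOMEM;     /* nothing to undo: return directly */

	c->task = fail_second ? NULL : malloc(16);
	if (!c->task) {
		ret = -ESRCH;
		goto err_release;   /* first step succeeded: undo it */
	}

	return 0;

err_release:
	free(c->maps);
	c->maps = NULL;
	return ret;
}
```

The first failure returns without touching a label, while the second jumps to
the one place that releases what the first step acquired.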

[...]

+const struct file_operations proc_totmaps_operations = {
+   .open   = totmaps_open,
+   .read   = seq_read,
+   .llseek = seq_lseek,
+   .release= proc_map_release,
+};


As I said regarding v2 already:
This won't release priv->task, causing a memory leak (exploitable
through a reference counter overflow of the task_struct usage
counter).



Sorry about dropping the ball on that one. What's the correct way to release
priv->task?
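
The v4 changelog records the eventual fix as calling put_task_struct during
.release(). The general rule behind it, that a reference taken at open time
must be dropped by the matching release hook, can be sketched in user space
as follows. All types and functions here are hypothetical stand-ins, not the
kernel code.

```c
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a task_struct. */
struct obj {
	int refs;
};

static struct obj *obj_get(struct obj *o) { o->refs++; return o; }
static void obj_put(struct obj *o) { o->refs--; }

/* Per-open private state, loosely analogous to proc_maps_private. */
struct priv {
	struct obj *task;
};

static struct priv *sketch_open(struct obj *o)
{
	struct priv *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->task = obj_get(o);   /* reference taken at open time */
	return p;
}

/* The release hook drops what open took, then frees the private state. */
static void sketch_release(struct priv *p)
{
	obj_put(p->task);
	free(p);
}
```

If the release hook only freed the private state, every open/close cycle
would leave the count one higher, which is the leak pointed out above.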

Re: [PACTH v3 3/3] Documentation/filesystems: Added /proc/PID/totmaps documentation

2016-08-16 Thread Robert Foss




On 2016-08-16 02:01 PM, Jann Horn wrote:

nit: s/extenssion/extension/


Thanks :)

[PACTH v3 3/3] Documentation/filesystems: Added /proc/PID/totmaps documentation

2016-08-16 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

Added documentation covering /proc/PID/totmaps.

Signed-off-by: Robert Foss <robert.f...@collabora.com>
---
 Documentation/filesystems/proc.txt | 21 +
 1 file changed, 21 insertions(+)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 7d001be..c06ff33 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -11,6 +11,7 @@ Version 1.3  
Kernel version 2.2.12
  Kernel version 2.4.0-test11-pre4
 --
 fixes/update part 1.1  Stefani Seibold <stef...@seibold.net>   June 9 2009
+add totmaps    Robert Foss <robert.f...@collabora.com>  August 12 2016
 
 Table of Contents
 -
@@ -147,6 +148,8 @@ Table 1-1: Process specific entries in /proc
  stack Report full stack trace, enable via CONFIG_STACKTRACE
  smaps an extension based on maps, showing the memory consumption of
each mapping and flags associated with it
+ totmaps an extenssion based on maps, showing the total memory
+consumption of all mappings
  numa_maps an extension based on maps, showing the memory locality and
binding policy as well as mem usage (in pages) of each mapping.
 ..
@@ -512,6 +515,24 @@ be vanished or the reverse -- new added.
 This file is only present if the CONFIG_MMU kernel configuration option is
 enabled.
 
+The /proc/PID/totmaps is an extension based on maps, showing the memory
+consumption totals for all of the process's mappings. It lists the sums of the
+same statistics as /proc/PID/smaps.
+
+The process' mappings will be summarized as a series of lines like the
+following:
+
+Rss:                4256 kB
+Pss:                1170 kB
+Shared_Clean:       2720 kB
+Shared_Dirty:       1136 kB
+Private_Clean:         0 kB
+Private_Dirty:       400 kB
+Referenced:         4256 kB
+Anonymous:          1536 kB
+AnonHugePages:         0 kB
+Swap:                  0 kB
+
 The /proc/PID/clear_refs is used to reset the PG_Referenced and ACCESSED/YOUNG
 bits on both physical and virtual pages associated with a process, and the
 soft-dirty bit on pte (see Documentation/vm/soft-dirty.txt for details).
-- 
2.7.4
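
For readers who want to consume this output from a program, the following is a minimal C sketch, not part of the patch, that parses text in the "Key: value kB" format shown in the documentation hunk above. The struct totmaps_totals type and the parse_totmaps() helper are hypothetical names introduced for illustration. The sketch works on an in-memory string and assumes the ten field names from the sample output.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the ten totals in the sample output (in kB). */
struct totmaps_totals {
    unsigned long rss, pss, shared_clean, shared_dirty;
    unsigned long private_clean, private_dirty;
    unsigned long referenced, anonymous, anon_huge_pages, swap;
};

/*
 * Parse "Key: <value> kB" lines from NUL-terminated text.
 * Returns the number of recognised fields; unknown keys are ignored.
 */
static int parse_totmaps(const char *text, struct totmaps_totals *out)
{
    static const char *const keys[] = {
        "Rss", "Pss", "Shared_Clean", "Shared_Dirty", "Private_Clean",
        "Private_Dirty", "Referenced", "Anonymous", "AnonHugePages", "Swap",
    };
    unsigned long *const fields[] = {
        &out->rss, &out->pss, &out->shared_clean, &out->shared_dirty,
        &out->private_clean, &out->private_dirty, &out->referenced,
        &out->anonymous, &out->anon_huge_pages, &out->swap,
    };
    const char *p = text;
    int found = 0;

    memset(out, 0, sizeof(*out));
    while (p && *p) {
        char key[32];
        unsigned long val;

        /* Key is everything up to the colon; value is the number before "kB". */
        if (sscanf(p, "%31[^:\n]: %lu kB", key, &val) == 2) {
            for (size_t i = 0; i < sizeof(keys) / sizeof(keys[0]); i++) {
                if (strcmp(key, keys[i]) == 0) {
                    *fields[i] = val;
                    found++;
                    break;
                }
            }
        }
        /* Advance to the start of the next line, if any. */
        p = strchr(p, '\n');
        if (p)
            p++;
    }
    return found;
}
```

To try it against a file, read the file into a buffer (for example with fread()), NUL-terminate it, and pass it to parse_totmaps().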

[PACTH v3 2/3] Documentation/filesystems: Fixed typo

2016-08-16 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

Fixed a -> an typo.

Signed-off-by: Robert Foss <robert.f...@collabora.com>
---
 Documentation/filesystems/proc.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index e8d0075..7d001be 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -145,7 +145,7 @@ Table 1-1: Process specific entries in /proc
symbol the task is blocked in - or "0" if not blocked.
  pagemap   Page table
  stack Report full stack trace, enable via CONFIG_STACKTRACE
- smaps a extension based on maps, showing the memory consumption of
+ smaps an extension based on maps, showing the memory consumption of
each mapping and flags associated with it
  numa_maps an extension based on maps, showing the memory locality and
binding policy as well as mem usage (in pages) of each mapping.
-- 
2.7.4

[PACTH v3 1/3] mm, proc: Implement /proc/<pid>/totmaps

2016-08-16 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

This is based on earlier work by Thiago Goncales. It implements a new
per process proc file which summarizes the contents of the smaps file
but doesn't display any addresses.  It gives more detailed information
than statm like the PSS (proportional set size).  It differs from the
original implementation in that it doesn't use the full blown set of
seq operations, uses a different termination condition, and doesn't
display "Locked" as that was broken in the original implementation.

This new proc file provides information faster than parsing the potentially
huge smaps file.

Tested-by: Robert Foss <robert.f...@collabora.com>
Signed-off-by: Robert Foss <robert.f...@collabora.com>

Signed-off-by: Sonny Rao <sonny...@chromium.org>
---
 fs/proc/base.c |   1 +
 fs/proc/internal.h |   2 +
 fs/proc/task_mmu.c | 129 +
 3 files changed, 132 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index a11eb71..de3acdf 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2855,6 +2855,7 @@ static const struct pid_entry tgid_base_stuff[] = {
REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
REG("smaps",  S_IRUGO, proc_pid_smaps_operations),
REG("pagemap",S_IRUSR, proc_pagemap_operations),
+   REG("totmaps",S_IRUGO, proc_totmaps_operations),
 #endif
 #ifdef CONFIG_SECURITY
DIR("attr",   S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index aa27810..99f97d7 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -297,6 +297,8 @@ extern const struct file_operations proc_pid_smaps_operations;
 extern const struct file_operations proc_tid_smaps_operations;
 extern const struct file_operations proc_clear_refs_operations;
 extern const struct file_operations proc_pagemap_operations;
+extern const struct file_operations proc_totmaps_operations;
+
 
 extern unsigned long task_vsize(struct mm_struct *);
 extern unsigned long task_statm(struct mm_struct *,
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4648c7f..fe692cb 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -802,6 +802,75 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
return 0;
 }
 
+static void add_smaps_sum(struct mem_size_stats *mss,
+   struct mem_size_stats *mss_sum)
+{
+   mss_sum->resident += mss->resident;
+   mss_sum->pss += mss->pss;
+   mss_sum->shared_clean += mss->shared_clean;
+   mss_sum->shared_dirty += mss->shared_dirty;
+   mss_sum->private_clean += mss->private_clean;
+   mss_sum->private_dirty += mss->private_dirty;
+   mss_sum->referenced += mss->referenced;
+   mss_sum->anonymous += mss->anonymous;
+   mss_sum->anonymous_thp += mss->anonymous_thp;
+   mss_sum->swap += mss->swap;
+}
+
+static int totmaps_proc_show(struct seq_file *m, void *data)
+{
+   struct proc_maps_private *priv = m->private;
+   struct mm_struct *mm = priv->mm;
+   struct vm_area_struct *vma;
+   struct mem_size_stats mss_sum;
+
+   memset(&mss_sum, 0, sizeof(mss_sum));
+   down_read(&mm->mmap_sem);
+   hold_task_mempolicy(priv);
+
+   for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
+   struct mem_size_stats mss;
+   struct mm_walk smaps_walk = {
+   .pmd_entry = smaps_pte_range,
+   .mm = vma->vm_mm,
+   .private = &mss,
+   };
+
+   if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
+   memset(&mss, 0, sizeof(mss));
+   walk_page_vma(vma, &smaps_walk);
+   add_smaps_sum(&mss, &mss_sum);
+   }
+   }
+
+   release_task_mempolicy(priv);
+   up_read(&mm->mmap_sem);
+
+   seq_printf(m,
+  "Rss:            %8lu kB\n"
+  "Pss:            %8lu kB\n"
+  "Shared_Clean:   %8lu kB\n"
+  "Shared_Dirty:   %8lu kB\n"
+  "Private_Clean:  %8lu kB\n"
+  "Private_Dirty:  %8lu kB\n"
+  "Referenced:     %8lu kB\n"
+  "Anonymous:      %8lu kB\n"
+  "AnonHugePages:  %8lu kB\n"
+  "Swap:           %8lu kB\n",
+  mss_sum.resident >> 10,
+  (unsigned long)(mss_sum.pss >> (10 + PSS_SHIFT)),
+  mss_sum.shared_clean  >> 10,
+  mss_sum.shared_dirty  >> 10,
+  mss_sum.private_clean >> 10,
+  mss_sum.private_dirty >> 10,
+  mss_sum.referenced >> 10,
+  mss_sum.anonymous >>

[PACTH v3 0/3] Implement /proc/<pid>/totmaps

2016-08-16 Thread robert . foss

From: Robert Foss <robert.f...@collabora.com>

This series provides the /proc/PID/totmaps feature, which 
summarizes the information provided by /proc/PID/smaps for
improved performance and usability reasons.

A use case is to speed up monitoring of memory consumption in 
environments where RSS isn't precise.

For example Chrome tends to run many processes which have hundreds of VMAs
with a substantial amount of shared memory, and the error of using
RSS rather than PSS tends to be very large when looking at overall
memory consumption.  PSS isn't kept as a single number that's exported
like RSS, so to calculate PSS means having to parse a very large smaps
file.
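
To make the RSS versus PSS distinction concrete, here is a small userspace C sketch of proportional accounting. It is a model, not kernel code: each resident page adds its size divided by the number of processes mapping it, kept in fixed point so that fractional bytes are not lost per page. The 4 KiB page size and the PSS_SHIFT value of 12 are assumptions (the patch uses PSS_SHIFT but does not show its value), and pss_account_page() and pss_to_kb() are hypothetical helper names.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u /* assumption: 4 KiB pages */
#define PSS_SHIFT 12          /* assumption: value not shown in the patch */

/*
 * Add one resident page, mapped by `mapcount` processes, to a fixed-point
 * PSS accumulator. The page size is scaled up by PSS_SHIFT bits before the
 * division so the remainder of size / mapcount is mostly preserved.
 */
static void pss_account_page(uint64_t *pss_acc, unsigned int mapcount)
{
    if (mapcount == 0)
        return; /* an unmapped page contributes nothing */
    *pss_acc += ((uint64_t)PAGE_SIZE_BYTES << PSS_SHIFT) / mapcount;
}

/* Convert the accumulator to kB, mirroring "pss >> (10 + PSS_SHIFT)". */
static unsigned long pss_to_kb(uint64_t pss_acc)
{
    return (unsigned long)(pss_acc >> (10 + PSS_SHIFT));
}
```

With four pages shared by two processes and one private page, this model reports 12 kB of PSS where RSS would report 20 kB.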

This process is slow and has to be repeated for many processes, and we
found that just the act of doing the parsing was taking up a
significant amount of CPU time, so this patch is an attempt to make
that process cheaper.

/proc/PID/totmaps provides roughly a 2x speedup compared to parsing
/proc/PID/smaps with awk.

$ /usr/bin/time -v -p zsh -c "(repeat 25 {cat /proc/5025/totmaps})"
[...]
Command being timed: "zsh -c (repeat 25 {cat /proc/5025/totmaps})"
User time (seconds): 0.00
System time (seconds): 0.40
Percent of CPU this job got: 90%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.45


$ /usr/bin/time -v -p zsh -c "repeat 25 { awk '/^Rss/{rss+=\$2} 
/^Pss/{pss+=\$2} END {printf \"rss:%d pss:%d\n\", rss, pss}\' /proc/5025/smaps}"
[...]
Command being timed: "zsh -c repeat 25 { awk '/^Rss/{rss+=$2}
/^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}\' /proc/5025/smaps }"
User time (seconds): 0.37
System time (seconds): 0.45
Percent of CPU this job got: 92%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.89
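
The awk one-liner above can also be written in C. The sketch below sums the Rss and Pss lines of smaps-format text, which is the per-mapping work that totmaps moves into the kernel. sum_smaps() is a hypothetical helper name, and the sketch operates on a string rather than opening /proc/PID/smaps directly.

```c
#include <stdio.h>
#include <string.h>

/*
 * Sum the "Rss:" and "Pss:" values (in kB) over every mapping in
 * NUL-terminated smaps-format text. Only lines that begin with the key
 * are counted, matching the /^Rss/ and /^Pss/ patterns used with awk.
 */
static void sum_smaps(const char *text, unsigned long *rss_kb,
                      unsigned long *pss_kb)
{
    const char *p = text;

    *rss_kb = 0;
    *pss_kb = 0;
    while (p && *p) {
        unsigned long val;

        if (sscanf(p, "Rss: %lu kB", &val) == 1)
            *rss_kb += val;
        else if (sscanf(p, "Pss: %lu kB", &val) == 1)
            *pss_kb += val;

        /* Advance to the start of the next line, if any. */
        p = strchr(p, '\n');
        if (p)
            p++;
    }
}
```

Every line of every mapping still has to be formatted by the kernel and scanned in userspace here, which is the cost the benchmark above measures.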

Robert Foss (3):
  mm, proc: Implement /proc/<pid>/totmaps
  Documentation/filesystems: Fixed typo
  Documentation/filesystems: Added /proc/PID/totmaps documentation

 Documentation/filesystems/proc.txt |  23 ++-
 fs/proc/base.c |   1 +
 fs/proc/internal.h |   2 +
 fs/proc/task_mmu.c | 129 +
 4 files changed, 154 insertions(+), 1 deletion(-)

-- 
2.7.4
