date:20121014

Re: [PATCH] power: replace strict_str* with kstrto*

2012-10-14 Thread Rafael J. Wysocki

On Thursday 27 of September 2012 14:31:56 Pavel Machek wrote:
> On Wed 2012-09-26 22:15:06, Daniel Walter wrote:
> > power: replace strict_strtoul with kstrtoul
> > 
> > Signed-off-by: Daniel Walter 
> 
> ACK.

Thanks!  I'll queue it up for v3.8 when I get back home from the current trip.

Rafael
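
For readers unfamiliar with the conversion: kstrtoul() returns 0 on success
and a negative errno on failure, and it rejects trailing garbage other than a
single newline. The userspace sketch below only approximates that strictness
on top of strtoul(); parse_ulong_strict() is a hypothetical name and is not
part of the patch or of the kernel.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical userspace helper, loosely modelled on the strictness of
 * kstrtoul(): the whole string must be a number, optionally followed by
 * one newline. Returns 0 on success or a negative errno value.
 */
static int parse_ulong_strict(const char *s, unsigned int base,
			      unsigned long *res)
{
	char *end;
	unsigned long val;

	/* strtoul() would skip whitespace and accept '-'; reject both. */
	if (*s == '+')
		s++;
	if (!isalnum((unsigned char)*s))
		return -EINVAL;

	errno = 0;
	val = strtoul(s, &end, (int)base);
	if (errno == ERANGE)
		return -ERANGE;
	if (end == s)
		return -EINVAL;
	if (*end == '\n')
		end++;		/* tolerate exactly one trailing newline */
	if (*end != '\0')
		return -EINVAL;

	*res = val;
	return 0;
}
```

A sysfs-style store handler would typically pass the user buffer straight to
such a helper and return the error code unchanged when parsing fails.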


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[GIT PULL] sh updates for 3.7-rc2

2012-10-14 Thread Paul Mundt

The following changes since commit ddffeb8c4d0331609ef2581d84de4d763607bd37:

  Linux 3.7-rc1 (2012-10-14 14:41:04 -0700)

are available in the git repository at:

  git://github.com/pmundt/linux-sh tags/sh-for-linus

for you to fetch changes up to 0dd4d5cbe4c38165dc9b3ad329ebb23f24d74fdb:

  sh: Fix up more fallout from pointless ARM __iomem churn. (2012-10-15 14:08:48 +0900)


SuperH updates for 3.7-rc2


David Howells (1):
  UAPI: (Scripted) Disintegrate arch/sh/include/asm

Paul Mundt (3):
  Merge tag 'disintegrate-sh-20121009' of git://git.infradead.org/users/dhowells/linux-headers into sh-latest
  sh: Wire up kcmp syscall.
  sh: Fix up more fallout from pointless ARM __iomem churn.

 arch/sh/include/asm/Kbuild                      | 11
 arch/sh/include/asm/hw_breakpoint.h             |  4 +-
 arch/sh/include/asm/posix_types.h               |  8 ---
 arch/sh/include/asm/ptrace.h                    | 34 +--
 arch/sh/include/asm/ptrace_32.h                 | 75 +---
 arch/sh/include/asm/ptrace_64.h                 | 12 +---
 arch/sh/include/asm/setup.h                     |  5 +-
 arch/sh/include/asm/types.h                     |  5 +-
 arch/sh/include/asm/unistd.h                    |  9 +--
 arch/sh/include/uapi/asm/Kbuild                 | 22 +++
 arch/sh/include/{ => uapi}/asm/auxvec.h         |  0
 arch/sh/include/{ => uapi}/asm/byteorder.h      |  0
 arch/sh/include/{ => uapi}/asm/cachectl.h       |  0
 arch/sh/include/{ => uapi}/asm/cpu-features.h   |  0
 arch/sh/include/{ => uapi}/asm/ioctls.h         |  0
 arch/sh/include/uapi/asm/posix_types.h          |  7 +++
 arch/sh/include/{ => uapi}/asm/posix_types_32.h |  0
 arch/sh/include/{ => uapi}/asm/posix_types_64.h |  0
 arch/sh/include/uapi/asm/ptrace.h               | 34 +++
 arch/sh/include/uapi/asm/ptrace_32.h            | 77 +
 arch/sh/include/uapi/asm/ptrace_64.h            | 14 +
 arch/sh/include/uapi/asm/setup.h                |  1 +
 arch/sh/include/{ => uapi}/asm/sigcontext.h     |  0
 arch/sh/include/{ => uapi}/asm/signal.h         |  0
 arch/sh/include/{ => uapi}/asm/sockios.h        |  0
 arch/sh/include/{ => uapi}/asm/stat.h           |  0
 arch/sh/include/{ => uapi}/asm/swab.h           |  0
 arch/sh/include/uapi/asm/types.h                |  1 +
 arch/sh/include/uapi/asm/unistd.h               |  7 +++
 arch/sh/include/{ => uapi}/asm/unistd_32.h      |  3 +-
 arch/sh/include/{ => uapi}/asm/unistd_64.h      |  3 +-
 arch/sh/kernel/syscalls_32.S                    |  1 +
 arch/sh/kernel/syscalls_64.S                    |  1 +
 drivers/sh/intc/access.c                        | 45 +--
 drivers/sh/intc/chip.c                          |  4 +-
 drivers/tty/serial/sh-sci.c                     |  3 +-
 36 files changed, 210 insertions(+), 176 deletions(-)
 rename arch/sh/include/{ => uapi}/asm/auxvec.h (100%)
 rename arch/sh/include/{ => uapi}/asm/byteorder.h (100%)
 rename arch/sh/include/{ => uapi}/asm/cachectl.h (100%)
 rename arch/sh/include/{ => uapi}/asm/cpu-features.h (100%)
 create mode 100644 arch/sh/include/uapi/asm/hw_breakpoint.h
 rename arch/sh/include/{ => uapi}/asm/ioctls.h (100%)
 create mode 100644 arch/sh/include/uapi/asm/posix_types.h
 rename arch/sh/include/{ => uapi}/asm/posix_types_32.h (100%)
 rename arch/sh/include/{ => uapi}/asm/posix_types_64.h (100%)
 create mode 100644 arch/sh/include/uapi/asm/ptrace.h
 create mode 100644 arch/sh/include/uapi/asm/ptrace_32.h
 create mode 100644 arch/sh/include/uapi/asm/ptrace_64.h
 create mode 100644 arch/sh/include/uapi/asm/setup.h
 rename arch/sh/include/{ => uapi}/asm/sigcontext.h (100%)
 rename arch/sh/include/{ => uapi}/asm/signal.h (100%)
 rename arch/sh/include/{ => uapi}/asm/sockios.h (100%)
 rename arch/sh/include/{ => uapi}/asm/stat.h (100%)
 rename arch/sh/include/{ => uapi}/asm/swab.h (100%)
 create mode 100644 arch/sh/include/uapi/asm/types.h
 create mode 100644 arch/sh/include/uapi/asm/unistd.h
 rename arch/sh/include/{ => uapi}/asm/unistd_32.h (99%)
 rename arch/sh/include/{ => uapi}/asm/unistd_64.h (99%)

Re: [PATCH] base: power - use clk_prepare_enable and clk_prepare_disable

2012-10-14 Thread Rafael J. Wysocki

On Thursday 20 of September 2012 11:39:36 Murali Karicheri wrote:
> When PM runtime is enabled in DaVinci and the machine migrates to
> the common clk framework, clk_enable() gets called without
> clk_prepare(). This patch fixes the issue so that runtime PM
> can interwork with the common clk framework.
> 
> Signed-off-by: Murali Karicheri 

OK.  I'm not seeing people having problems with this patch, so I'll tentatively
queue it up for v3.8.

Thanks,
Rafael


> diff --git a/drivers/base/power/clock_ops.c b/drivers/base/power/clock_ops.c
> index eb78e96..9d8fde7 100644
> --- a/drivers/base/power/clock_ops.c
> +++ b/drivers/base/power/clock_ops.c
> @@ -99,7 +99,7 @@ static void __pm_clk_remove(struct pm_clock_entry *ce)
>  
>   if (ce->status < PCE_STATUS_ERROR) {
>   if (ce->status == PCE_STATUS_ENABLED)
> - clk_disable(ce->clk);
> + clk_disable_unprepare(ce->clk);
>  
>   if (ce->status >= PCE_STATUS_ACQUIRED)
>   clk_put(ce->clk);
> @@ -396,7 +396,7 @@ static void enable_clock(struct device *dev, const char *con_id)
>  
>   clk = clk_get(dev, con_id);
>   if (!IS_ERR(clk)) {
> - clk_enable(clk);
> + clk_prepare_enable(clk);
>   clk_put(clk);
>   dev_info(dev, "Runtime PM disabled, clock forced on.\n");
>   }
> @@ -413,7 +413,7 @@ static void disable_clock(struct device *dev, const char *con_id)
>  
>   clk = clk_get(dev, con_id);
>   if (!IS_ERR(clk)) {
> - clk_disable(clk);
> + clk_disable_unprepare(clk);
>   clk_put(clk);
>   dev_info(dev, "Runtime PM disabled, clock forced off.\n");
>   }
> 
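
The pairing rule behind the quoted patch can be shown with a small userspace
model. This is only a sketch: struct model_clk and the model_clk_*() helpers
are hypothetical stand-ins, not the common clk framework, and the error
return for enabling an unprepared clock is an assumption made for the model.

```c
#include <assert.h>

/* Hypothetical userspace model of a clock; not the kernel's struct clk. */
struct model_clk {
	int prepare_count;	/* models the sleepable "prepare" step */
	int enable_count;	/* models the atomic "enable" step */
};

static int model_clk_prepare(struct model_clk *clk)
{
	clk->prepare_count++;
	return 0;
}

static void model_clk_unprepare(struct model_clk *clk)
{
	assert(clk->prepare_count > 0);
	clk->prepare_count--;
}

/* Enabling an unprepared clock is the situation the patch avoids. */
static int model_clk_enable(struct model_clk *clk)
{
	if (clk->prepare_count == 0)
		return -1;
	clk->enable_count++;
	return 0;
}

static void model_clk_disable(struct model_clk *clk)
{
	assert(clk->enable_count > 0);
	clk->enable_count--;
}

/* Paired helpers in the spirit of clk_prepare_enable()/clk_disable_unprepare(). */
static int model_clk_prepare_enable(struct model_clk *clk)
{
	int ret = model_clk_prepare(clk);

	if (ret)
		return ret;
	ret = model_clk_enable(clk);
	if (ret)
		model_clk_unprepare(clk);	/* undo prepare on failure */
	return ret;
}

static void model_clk_disable_unprepare(struct model_clk *clk)
{
	model_clk_disable(clk);
	model_clk_unprepare(clk);
}
```

In the model, a bare model_clk_enable() on a fresh clock fails, while the
combined helper succeeds and leaves both counters balanced after the matching
disable_unprepare call.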
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

[PATCH] powerpc: added DSCR support to ptrace

2012-10-14 Thread Alexey Kardashevskiy

The DSCR (aka Data Stream Control Register) is supported on some
server PowerPC chips and allows some control over the prefetch
of data streams.

The kernel already supports a per-thread DSCR value, but there is also
a need for the ability to change it from an external process for
a specific pid.

The patch adds new register index PT_DSCR (index=44) which can be
set/get by:
  ptrace(PTRACE_POKEUSER, traced_process, PT_DSCR << 3, dscr);
  dscr = ptrace(PTRACE_PEEKUSER, traced_process, PT_DSCR << 3, NULL);

Signed-off-by: Alexey Kardashevskiy 
---
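
The "PT_DSCR << 3" in the ptrace() calls above converts a register index into
a byte offset in the user area. A minimal sketch of that arithmetic follows;
PT_DSCR_INDEX and pt_reg_offset() are illustrative names, and the shift by 3
assumes 8-byte register slots, as on a 64-bit kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Register index proposed by the patch, copied from its description. */
#define PT_DSCR_INDEX 44

/*
 * PEEKUSER/POKEUSER take a byte offset, not an index. Shifting left by 3
 * multiplies by 8, which assumes each register slot is 8 bytes wide.
 */
static inline long pt_reg_offset(int index)
{
	return (long)index << 3;
}
```

With the patch applied on such a kernel, pt_reg_offset(PT_DSCR_INDEX) would
be the value passed as the addr argument of ptrace().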
 arch/powerpc/include/asm/ptrace.h |    3 ++-
 arch/powerpc/kernel/ptrace.c      |   16
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
index 84cc784..946c556 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -276,7 +276,8 @@ static inline unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
 #define PT_DAR 41
 #define PT_DSISR 42
 #define PT_RESULT 43
-#define PT_REGS_COUNT 44
+#define PT_DSCR 44
+#define PT_REGS_COUNT 45
 
 #define PT_FPR0 48  /* each FP reg occupies 2 slots in this space */
 
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 8d8e028..4798acf 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -179,6 +179,17 @@ static int set_user_msr(struct task_struct *task, unsigned long msr)
return 0;
 }
 
+static unsigned long get_user_dscr(struct task_struct *task)
+{
+   return task->thread.dscr;
+}
+
+static int set_user_dscr(struct task_struct *task, unsigned long dscr)
+{
+   task->thread.dscr = dscr;
+   return 0;
+}
+
 /*
  * We prevent mucking around with the reserved area of trap
  * which are used internally by the kernel.
@@ -200,6 +211,9 @@ unsigned long ptrace_get_reg(struct task_struct *task, int regno)
if (regno == PT_MSR)
return get_user_msr(task);
 
+   if (regno == PT_DSCR)
+   return get_user_dscr(task);
+
if (regno < (sizeof(struct pt_regs) / sizeof(unsigned long)))
return ((unsigned long *)task->thread.regs)[regno];
 
@@ -218,6 +232,8 @@ int ptrace_put_reg(struct task_struct *task, int regno, unsigned long data)
return set_user_msr(task, data);
if (regno == PT_TRAP)
return set_user_trap(task, data);
+   if (regno == PT_DSCR)
+   return set_user_dscr(task, data);
 
if (regno <= PT_MAX_PUT_REG) {
((unsigned long *)task->thread.regs)[regno] = data;
-- 
1.7.10.4


[PATCH] perf/x86: Avoid kfree() in CPU_{STARTING,DYING}

2012-10-14 Thread Yan, Zheng

From: "Yan, Zheng" 

On -rt kfree() can schedule, but CPU_{STARTING,DYING} should be
atomic. So use a list to defer kfree until CPU_{ONLINE,DEAD}.

Signed-off-by: Yan, Zheng 
---
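
The deferral idea can be sketched outside the kernel as follows. This is a
single-threaded userspace illustration only: struct box, defer_free() and
free_deferred() are hypothetical names, and a plain singly linked list stands
in for the list_head the patch uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct intel_uncore_box. */
struct box {
	struct box *next;	/* link used only while queued for freeing */
	int id;
};

static struct box *boxes_to_free;	/* head of the deferred-free list */

/* "Atomic" phase: only links the box, never calls free(). */
static void defer_free(struct box *b)
{
	b->next = boxes_to_free;
	boxes_to_free = b;
}

/* Later phase where blocking is allowed: drain the list, return the count. */
static int free_deferred(void)
{
	int n = 0;

	while (boxes_to_free) {
		struct box *b = boxes_to_free;

		boxes_to_free = b->next;
		free(b);
		n++;
	}
	return n;
}
```

The point of the split is that the first function does no allocator work at
all, so it is safe to call where sleeping is forbidden; the second runs later
from a context that may sleep.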
 arch/x86/kernel/cpu/perf_event_intel_uncore.c | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index 99d96a4..6df0e60 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -2611,6 +2611,20 @@ static void __init uncore_pci_exit(void)
}
 }
 
+static LIST_HEAD(boxes_to_free);
+
+static void __cpuinit uncore_kfree_boxes(void)
+{
+   struct intel_uncore_box *box;
+
+   while (!list_empty(&boxes_to_free)) {
+   box = list_entry(boxes_to_free.next,
+struct intel_uncore_box, list);
+   list_del(&box->list);
+   kfree(box);
+   }
+}
+
 static void __cpuinit uncore_cpu_dying(int cpu)
 {
struct intel_uncore_type *type;
@@ -2625,7 +2639,7 @@ static void __cpuinit uncore_cpu_dying(int cpu)
box = *per_cpu_ptr(pmu->box, cpu);
*per_cpu_ptr(pmu->box, cpu) = NULL;
if (box && atomic_dec_and_test(&box->refcnt))
-   kfree(box);
+   list_add(&box->list, &boxes_to_free);
}
}
 }
@@ -2655,8 +2669,11 @@ static int __cpuinit uncore_cpu_starting(int cpu)
if (exist && exist->phys_id == phys_id) {
atomic_inc(&exist->refcnt);
*per_cpu_ptr(pmu->box, cpu) = exist;
-   kfree(box);
-   box = NULL;
+   if (box) {
+   list_add(&box->list,
+&boxes_to_free);
+   box = NULL;
+   }
break;
}
}
@@ -2795,6 +2812,10 @@ static int
case CPU_DYING:
uncore_cpu_dying(cpu);
break;
+   case CPU_ONLINE:
+   case CPU_DEAD:
+   uncore_kfree_boxes();
+   break;
default:
break;
}
-- 
1.7.11.7


[PATCH v4 03/10] thp: copy_huge_pmd(): copy huge zero page

2012-10-14 Thread Kirill A. Shutemov

From: "Kirill A. Shutemov" 

It's easy to copy the huge zero page: just set the destination pmd to the
huge zero page.

It's safe to copy huge zero page since we have none yet :-p

Signed-off-by: Kirill A. Shutemov 
---
 mm/huge_memory.c |   17 +
 1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 680c27f..9f5e5cb 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -700,6 +700,18 @@ static inline struct page *alloc_hugepage(int defrag)
 }
 #endif
 
+static void set_huge_zero_page(pgtable_t pgtable, struct mm_struct *mm,
+   struct vm_area_struct *vma, unsigned long haddr, pmd_t *pmd)
+{
+   pmd_t entry;
+   entry = pfn_pmd(huge_zero_pfn, vma->vm_page_prot);
+   entry = pmd_wrprotect(entry);
+   entry = pmd_mkhuge(entry);
+   set_pmd_at(mm, haddr, pmd, entry);
+   pgtable_trans_huge_deposit(mm, pgtable);
+   mm->nr_ptes++;
+}
+
 int do_huge_pmd_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
   unsigned long address, pmd_t *pmd,
   unsigned int flags)
@@ -777,6 +789,11 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
pte_free(dst_mm, pgtable);
goto out_unlock;
}
+   if (is_huge_zero_pmd(pmd)) {
+   set_huge_zero_page(pgtable, dst_mm, vma, addr, dst_pmd);
+   ret = 0;
+   goto out_unlock;
+   }
if (unlikely(pmd_trans_splitting(pmd))) {
/* split huge page running from under us */
spin_unlock(&src_mm->page_table_lock);
-- 
1.7.7.6


[PATCH v4 04/10] thp: do_huge_pmd_wp_page(): handle huge zero page

2012-10-14 Thread Kirill A. Shutemov

From: "Kirill A. Shutemov" 

On write access to the huge zero page we allocate a new huge page and clear it.

If ENOMEM, graceful fallback: we create a new pmd table and set the pte
around the fault address to a newly allocated normal (4k) page. All other ptes
in the pmd are set to the normal zero page.

Signed-off-by: Kirill A. Shutemov 
---
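
The ENOMEM fallback described above can be pictured with a small userspace
model of the resulting table. This is a sketch under stated assumptions: 4k
base pages and 512 ptes per pmd (2M huge pages, as on x86-64); struct
model_pte and fill_fallback_table() are hypothetical and ignore locking,
rmap and TLB handling entirely.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SHIFT	12			/* assumes 4k pages */
#define MODEL_PAGE_SIZE		(1UL << MODEL_PAGE_SHIFT)
#define MODEL_PTES_PER_PMD	512			/* assumes 2M huge pages */
#define MODEL_HPAGE_SIZE	(MODEL_PTES_PER_PMD * MODEL_PAGE_SIZE)

/* Hypothetical, simplified pte: a frame number plus a write bit. */
struct model_pte {
	unsigned long pfn;
	bool writable;
};

/*
 * Fill the table that replaces a huge zero pmd: the pte covering the
 * faulting address maps the new page writable, every other pte maps
 * the ordinary zero page read-only.
 */
static void fill_fallback_table(struct model_pte table[MODEL_PTES_PER_PMD],
				unsigned long address,
				unsigned long new_pfn, unsigned long zero_pfn)
{
	unsigned long haddr = address & ~(MODEL_HPAGE_SIZE - 1);
	unsigned long fault = address & ~(MODEL_PAGE_SIZE - 1);
	int i;

	for (i = 0; i < MODEL_PTES_PER_PMD; i++, haddr += MODEL_PAGE_SIZE) {
		if (haddr == fault) {
			table[i].pfn = new_pfn;
			table[i].writable = true;
		} else {
			table[i].pfn = zero_pfn;
			table[i].writable = false;
		}
	}
}
```

Only one 4k page of real memory is consumed in this path; the other 511
entries keep sharing the zero page until they take their own write faults.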
 include/linux/mm.h |    8 +++
 mm/huge_memory.c   |  129 ++--
 mm/memory.c        |    7 ---
 3 files changed, 122 insertions(+), 22 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index fa06804..fe329da 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -516,6 +516,14 @@ static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
 }
 #endif
 
+#ifndef my_zero_pfn
+static inline unsigned long my_zero_pfn(unsigned long addr)
+{
+   extern unsigned long zero_pfn;
+   return zero_pfn;
+}
+#endif
+
 /*
  * Multiple processes may "see" the same page. E.g. for untouched
  * mappings of /dev/null, all processes see the same page full of
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9f5e5cb..76548b1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -823,6 +823,88 @@ out:
return ret;
 }
 
+/* no "address" argument so destroys page coloring of some arch */
+pgtable_t get_pmd_huge_pte(struct mm_struct *mm)
+{
+   pgtable_t pgtable;
+
+   assert_spin_locked(&mm->page_table_lock);
+
+   /* FIFO */
+   pgtable = mm->pmd_huge_pte;
+   if (list_empty(&pgtable->lru))
+   mm->pmd_huge_pte = NULL;
+   else {
+   mm->pmd_huge_pte = list_entry(pgtable->lru.next,
+ struct page, lru);
+   list_del(&pgtable->lru);
+   }
+   return pgtable;
+}
+
+static int do_huge_pmd_wp_zero_page_fallback(struct mm_struct *mm,
+   struct vm_area_struct *vma, unsigned long address,
+   pmd_t *pmd, unsigned long haddr)
+{
+   pgtable_t pgtable;
+   pmd_t _pmd;
+   struct page *page;
+   int i, ret = 0;
+   unsigned long mmun_start;   /* For mmu_notifiers */
+   unsigned long mmun_end; /* For mmu_notifiers */
+
+   page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address);
+   if (!page) {
+   ret |= VM_FAULT_OOM;
+   goto out;
+   }
+
+   if (mem_cgroup_newpage_charge(page, mm, GFP_KERNEL)) {
+   put_page(page);
+   ret |= VM_FAULT_OOM;
+   goto out;
+   }
+
+   clear_user_highpage(page, address);
+   __SetPageUptodate(page);
+
+   mmun_start = haddr;
+   mmun_end   = haddr + HPAGE_PMD_SIZE;
+   mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+
+   spin_lock(&mm->page_table_lock);
+   pmdp_clear_flush(vma, haddr, pmd);
+   /* leave pmd empty until pte is filled */
+
+   pgtable = get_pmd_huge_pte(mm);
+   pmd_populate(mm, &_pmd, pgtable);
+
+   for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
+   pte_t *pte, entry;
+   if (haddr == (address & PAGE_MASK)) {
+   entry = mk_pte(page, vma->vm_page_prot);
+   entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+   page_add_new_anon_rmap(page, vma, haddr);
+   } else {
+   entry = pfn_pte(my_zero_pfn(haddr), vma->vm_page_prot);
+   entry = pte_mkspecial(entry);
+   }
+   pte = pte_offset_map(&_pmd, haddr);
+   VM_BUG_ON(!pte_none(*pte));
+   set_pte_at(mm, haddr, pte, entry);
+   pte_unmap(pte);
+   }
+   smp_wmb(); /* make pte visible before pmd */
+   pmd_populate(mm, pmd, pgtable);
+   spin_unlock(&mm->page_table_lock);
+
+   mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+
+   ret |= VM_FAULT_WRITE;
+out:
+   return ret;
+}
+
 static int do_huge_pmd_wp_page_fallback(struct mm_struct *mm,
struct vm_area_struct *vma,
unsigned long address,
@@ -929,19 +1011,21 @@ int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long address, pmd_t *pmd, pmd_t orig_pmd)
 {
int ret = 0;
-   struct page *page, *new_page;
+   struct page *page = NULL, *new_page;
unsigned long haddr;
unsigned long mmun_start;   /* For mmu_notifiers */
unsigned long mmun_end; /* For mmu_notifiers */
 
VM_BUG_ON(!vma->anon_vma);
+   haddr = address & HPAGE_PMD_MASK;
+   if (is_huge_zero_pmd(orig_pmd))
+   goto alloc;
spin_lock(&mm->page_table_lock);
if (unlikely(!pmd_same(*pmd, orig_pmd)))
goto out_unlock;
 
page = pmd_page(orig_pmd);
VM_BUG_ON(!PageCompound(page) || !PageHead(page));
-   haddr = addr

[PATCH v4 06/10] thp: change split_huge_page_pmd() interface

2012-10-14 Thread Kirill A. Shutemov

From: "Kirill A. Shutemov" 

Pass vma instead of mm and add an address parameter.

In most cases we already have the vma on the stack. We provide
split_huge_page_pmd_mm() for the few cases where we have mm, but not vma.

This change is preparation for the huge zero pmd splitting implementation.

Signed-off-by: Kirill A. Shutemov 
---
 Documentation/vm/transhuge.txt |    4 ++--
 arch/x86/kernel/vm86_32.c      |    2 +-
 fs/proc/task_mmu.c             |    2 +-
 include/linux/huge_mm.h        |   14 ++
 mm/huge_memory.c               |   24 +++-
 mm/memory.c                    |    4 ++--
 mm/mempolicy.c                 |    2 +-
 mm/mprotect.c                  |    2 +-
 mm/mremap.c                    |    2 +-
 mm/pagewalk.c                  |    2 +-
 10 files changed, 39 insertions(+), 19 deletions(-)

diff --git a/Documentation/vm/transhuge.txt b/Documentation/vm/transhuge.txt
index f734bb2..677a599 100644
--- a/Documentation/vm/transhuge.txt
+++ b/Documentation/vm/transhuge.txt
@@ -276,7 +276,7 @@ unaffected. libhugetlbfs will also work fine as usual.
 == Graceful fallback ==
 
 Code walking pagetables but unware about huge pmds can simply call
-split_huge_page_pmd(mm, pmd) where the pmd is the one returned by
+split_huge_page_pmd(vma, addr, pmd) where the pmd is the one returned by
 pmd_offset. It's trivial to make the code transparent hugepage aware
 by just grepping for "pmd_offset" and adding split_huge_page_pmd where
 missing after pmd_offset returns the pmd. Thanks to the graceful
@@ -299,7 +299,7 @@ diff --git a/mm/mremap.c b/mm/mremap.c
return NULL;
 
pmd = pmd_offset(pud, addr);
-+  split_huge_page_pmd(mm, pmd);
++  split_huge_page_pmd(vma, addr, pmd);
if (pmd_none_or_clear_bad(pmd))
return NULL;
 
diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
index 5c9687b..1dfe69c 100644
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -182,7 +182,7 @@ static void mark_screen_rdonly(struct mm_struct *mm)
if (pud_none_or_clear_bad(pud))
goto out;
 pmd = pmd_offset(pud, 0xA0000);
-   split_huge_page_pmd(mm, pmd);
+   split_huge_page_pmd_mm(mm, 0xA0000, pmd);
 if (pmd_none_or_clear_bad(pmd))
 goto out;
 pte = pte_offset_map_lock(mm, pmd, 0xA0000, &ptl);
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 79827ce..866aa48 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -597,7 +597,7 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr,
spinlock_t *ptl;
struct page *page;
 
-   split_huge_page_pmd(walk->mm, pmd);
+   split_huge_page_pmd(vma, addr, pmd);
if (pmd_trans_unstable(pmd))
return 0;
 
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index b31cb7d..856f080 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -91,12 +91,14 @@ extern int handle_pte_fault(struct mm_struct *mm,
struct vm_area_struct *vma, unsigned long address,
pte_t *pte, pmd_t *pmd, unsigned int flags);
 extern int split_huge_page(struct page *page);
-extern void __split_huge_page_pmd(struct mm_struct *mm, pmd_t *pmd);
-#define split_huge_page_pmd(__mm, __pmd)   \
+extern void __split_huge_page_pmd(struct vm_area_struct *vma,
+   unsigned long address, pmd_t *pmd);
+#define split_huge_page_pmd(__vma, __address, __pmd)   \
do {\
pmd_t *pmd = (__pmd);   \
if (unlikely(pmd_trans_huge(*pmd))) \
-   __split_huge_page_pmd(__mm, pmd);   \
+   __split_huge_page_pmd(__vma, __address, \
+   pmd);   \
}  while (0)
 #define wait_split_huge_page(__anon_vma, __pmd)                \
do {\
@@ -106,6 +108,8 @@ extern void __split_huge_page_pmd(struct mm_struct *mm, pmd_t *pmd);
BUG_ON(pmd_trans_splitting(*pmd) || \
   pmd_trans_huge(*pmd));   \
} while (0)
+extern void split_huge_page_pmd_mm(struct mm_struct *mm, unsigned long address,
+   pmd_t *pmd);
 #if HPAGE_PMD_ORDER > MAX_ORDER
 #error "hugepages can't be allocated by the buddy allocator"
 #endif
@@ -173,10 +177,12 @@ static inline int split_huge_page(struct page *page)
 {
return 0;
 }
-#define split_huge_page_pmd(__mm, __pmd)   \
+#define split_huge_page_pmd(__vma, __address, __pmd)   \
do { } while (0)
 #define wait_split_huge_page(__anon_vma, __pmd)\
do { } while (0)
+#define split_huge_page_pmd_mm(__mm, __add

[PATCH v4 10/10] thp: implement refcounting for huge zero page

2012-10-14 Thread Kirill A. Shutemov

From: "Kirill A. Shutemov" 

H. Peter Anvin doesn't like a huge zero page that stays in memory forever
after the first allocation. Here's an implementation of lockless refcounting
for the huge zero page.

We have two basic primitives: {get,put}_huge_zero_page(). They
manipulate the reference counter.

If the counter is 0, get_huge_zero_page() allocates a new huge page and
takes two references: one for the caller and one for the shrinker. We free
the page only in the shrinker callback, and only if the counter is 1 (only
the shrinker holds a reference).

put_huge_zero_page() only decrements the counter. The counter never reaches
zero in put_huge_zero_page() since the shrinker holds one reference.

Freeing the huge zero page in the shrinker callback helps to avoid frequent
allocate-free cycles.

Refcounting has a cost. On a 4-socket machine I observe a ~1% slowdown on
parallel (40 processes) read page faulting compared to lazy huge page
allocation.  I think that's pretty reasonable for a synthetic benchmark.

Signed-off-by: Kirill A. Shutemov 
---
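
The get/put/shrink protocol described above can be modelled in userspace with
C11 atomics. The sketch below is an illustration only, not the patch's code:
a small calloc() buffer stands in for the huge page, the names drop the
"huge" prefix to mark them as hypothetical, and it has only been reasoned
about for single-threaded use (the patch additionally disables preemption
around the window between publishing the page and setting the counter).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

static atomic_int zero_refcount;	/* 0 means "no page allocated" */
static _Atomic(void *) zero_page;	/* stand-in for huge_zero_pfn */

/* Increment unless the counter is zero; models atomic_inc_not_zero(). */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

static void *get_zero_page(void)
{
	void *page, *expected;

	for (;;) {
		if (inc_not_zero(&zero_refcount))
			return atomic_load(&zero_page);

		page = calloc(1, 4096);	/* small buffer, not a huge page */
		if (!page)
			return NULL;

		expected = NULL;
		if (atomic_compare_exchange_strong(&zero_page, &expected, page)) {
			/* one reference for the caller, one for the shrinker */
			atomic_store(&zero_refcount, 2);
			return page;
		}
		free(page);	/* lost the race; retry with the winner's page */
	}
}

static void put_zero_page(void)
{
	int old = atomic_fetch_sub(&zero_refcount, 1);

	assert(old > 1);	/* the shrinker's reference must remain */
	(void)old;
}

/* Shrinker: frees only when its own reference is the last one left. */
static bool shrink_zero_page(void)
{
	int expected = 1;

	if (!atomic_compare_exchange_strong(&zero_refcount, &expected, 0))
		return false;
	free(atomic_exchange(&zero_page, (void *)NULL));
	return true;
}
```

Because the shrinker's reference is dropped only by the compare-and-swap
from 1 to 0, a concurrent get either sees a non-zero counter and reuses the
page, or sees zero and allocates a fresh one.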
 mm/huge_memory.c |  111 ++
 1 files changed, 87 insertions(+), 24 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8fae26a..a4f2110 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "internal.h"
@@ -46,7 +47,6 @@ static unsigned int khugepaged_scan_sleep_millisecs __read_mostly = 10000;
 /* during fragmentation poll the hugepage allocator once every minute */
 static unsigned int khugepaged_alloc_sleep_millisecs __read_mostly = 60000;
 static struct task_struct *khugepaged_thread __read_mostly;
-static unsigned long huge_zero_pfn __read_mostly;
 static DEFINE_MUTEX(khugepaged_mutex);
 static DEFINE_SPINLOCK(khugepaged_mm_lock);
 static DECLARE_WAIT_QUEUE_HEAD(khugepaged_wait);
@@ -159,31 +159,74 @@ static int start_khugepaged(void)
return err;
 }
 
-static int init_huge_zero_pfn(void)
+static atomic_t huge_zero_refcount;
+static unsigned long huge_zero_pfn __read_mostly;
+
+static inline bool is_huge_zero_pfn(unsigned long pfn)
 {
-   struct page *hpage;
-   unsigned long pfn;
+   unsigned long zero_pfn = ACCESS_ONCE(huge_zero_pfn);
+   return zero_pfn && pfn == zero_pfn;
+}
+
+static inline bool is_huge_zero_pmd(pmd_t pmd)
+{
+   return is_huge_zero_pfn(pmd_pfn(pmd));
+}
+
+static unsigned long get_huge_zero_page(void)
+{
+   struct page *zero_page;
+retry:
+   if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
+   return ACCESS_ONCE(huge_zero_pfn);
 
-   hpage = alloc_pages((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
+   zero_page = alloc_pages((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
HPAGE_PMD_ORDER);
-   if (!hpage)
-   return -ENOMEM;
-   pfn = page_to_pfn(hpage);
-   if (cmpxchg(&huge_zero_pfn, 0, pfn))
-   __free_page(hpage);
-   return 0;
+   if (!zero_page)
+   return 0;
+   preempt_disable();
+   if (cmpxchg(&huge_zero_pfn, 0, page_to_pfn(zero_page))) {
+   preempt_enable();
+   __free_page(zero_page);
+   goto retry;
+   }
+
+   /* We take additional reference here. It will be put back by shrinker */
+   atomic_set(&huge_zero_refcount, 2);
+   preempt_enable();
+   return ACCESS_ONCE(huge_zero_pfn);
 }
 
-static inline bool is_huge_zero_pfn(unsigned long pfn)
+static void put_huge_zero_page(void)
 {
-   return huge_zero_pfn && pfn == huge_zero_pfn;
+   /*
+* Counter should never go to zero here. Only shrinker can put
+* last reference.
+*/
+   BUG_ON(atomic_dec_and_test(&huge_zero_refcount));
 }
 
-static inline bool is_huge_zero_pmd(pmd_t pmd)
+static int shrink_huge_zero_page(struct shrinker *shrink,
+   struct shrink_control *sc)
 {
-   return is_huge_zero_pfn(pmd_pfn(pmd));
+   if (!sc->nr_to_scan)
+   /* we can free zero page only if last reference remains */
+   return atomic_read(&huge_zero_refcount) == 1 ? HPAGE_PMD_NR : 0;
+
+   if (atomic_cmpxchg(&huge_zero_refcount, 1, 0) == 1) {
+   unsigned long zero_pfn = xchg(&huge_zero_pfn, 0);
+   BUG_ON(zero_pfn == 0);
+   __free_page(__pfn_to_page(zero_pfn));
+   }
+
+   return 0;
 }
 
+static struct shrinker huge_zero_page_shrinker = {
+   .shrink = shrink_huge_zero_page,
+   .seeks = DEFAULT_SEEKS,
+};
+
 #ifdef CONFIG_SYSFS
 
 static ssize_t double_flag_show(struct kobject *kobj,
@@ -575,6 +618,8 @@ static int __init hugepage_init(void)
goto out;
}
 
+   register_shrinker(&huge_zero_page_shrinker);
+
/*
 * By default disable transparent hugepages on smaller systems,
 * where the extra memory used could hurt more than TLB overhead
@@ -697,10 +742,11 @@ static inline struct page *alloc_hugepage(int defrag)
 #endif
 
 static voi

[PATCH v4 02/10] thp: zap_huge_pmd(): zap huge zero pmd

2012-10-14 Thread Kirill A. Shutemov

From: "Kirill A. Shutemov" 

We don't have a real page to zap in huge zero page case. Let's just
clear pmd and remove it from tlb.

Signed-off-by: Kirill A. Shutemov 
---
 mm/huge_memory.c |   21 +
 1 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 438adbf..680c27f 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1057,15 +1057,20 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
pmd_t orig_pmd;
pgtable = pgtable_trans_huge_withdraw(tlb->mm);
orig_pmd = pmdp_get_and_clear(tlb->mm, addr, pmd);
-   page = pmd_page(orig_pmd);
tlb_remove_pmd_tlb_entry(tlb, pmd, addr);
-   page_remove_rmap(page);
-   VM_BUG_ON(page_mapcount(page) < 0);
-   add_mm_counter(tlb->mm, MM_ANONPAGES, -HPAGE_PMD_NR);
-   VM_BUG_ON(!PageHead(page));
-   tlb->mm->nr_ptes--;
-   spin_unlock(&tlb->mm->page_table_lock);
-   tlb_remove_page(tlb, page);
+   if (is_huge_zero_pmd(orig_pmd)) {
+   tlb->mm->nr_ptes--;
+   spin_unlock(&tlb->mm->page_table_lock);
+   } else {
+   page = pmd_page(orig_pmd);
+   page_remove_rmap(page);
+   VM_BUG_ON(page_mapcount(page) < 0);
+   add_mm_counter(tlb->mm, MM_ANONPAGES, -HPAGE_PMD_NR);
+   VM_BUG_ON(!PageHead(page));
+   tlb->mm->nr_ptes--;
+   spin_unlock(&tlb->mm->page_table_lock);
+   tlb_remove_page(tlb, page);
+   }
pte_free(tlb->mm, pgtable);
ret = 1;
}
-- 
1.7.7.6


[PATCH v4 00/10, REBASED] Introduce huge zero page

2012-10-14 Thread Kirill A. Shutemov

From: "Kirill A. Shutemov" 

Hi,

Andrew, here's the huge zero page patchset rebased to v3.7-rc1.

Andrea, I've dropped your Reviewed-by due to not-so-trivial conflicts during
the rebase. Could you look through it again? Patches 2, 3, 4, 7, 10 had
conflicts, mostly due to the new MMU notifiers interface.

=

During testing I noticed a big (up to 2.5 times) memory consumption overhead
on some workloads (e.g. ft.A from NPB) if THP is enabled.

The main reason for that big difference is the lack of a zero page in the THP
case. We have to allocate a real page on a read page fault.

A program to demonstrate the issue:
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

#define MB (1024*1024)

int main(int argc, char **argv)
{
        char *p;
        int i;

        posix_memalign((void **)&p, 2 * MB, 200 * MB);
        for (i = 0; i < 200 * MB; i+= 4096)
                assert(p[i] == 0);
        pause();
        return 0;
}

With thp-never RSS is about 400k, but with thp-always it's 200M.
After the patchset thp-always RSS is 400k too.
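
The 200M figure follows from simple arithmetic, sketched below. The helper
names are hypothetical, and the model assumes 2M huge pages and that every
read fault in an untouched 2M-aligned region is backed by a real huge page,
which is the behaviour the patchset sets out to change.

```c
#include <assert.h>
#include <stddef.h>

#define MIB (1024UL * 1024UL)

/* Number of huge pages needed to back len bytes, rounding up. */
static size_t huge_pages_for(size_t len, size_t huge_page_size)
{
	return (len + huge_page_size - 1) / huge_page_size;
}

/* Resident size if every touched huge-page region gets a real page. */
static size_t rss_if_all_backed(size_t len, size_t huge_page_size)
{
	return huge_pages_for(len, huge_page_size) * huge_page_size;
}
```

For the 200M buffer in the demo this gives 100 huge pages, all resident even
though the program only reads zeros.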

Design overview.

Huge zero page (hzp) is a non-movable huge page (2M on x86-64) filled with
zeros.  How we allocate it changes over the course of the patchset:

- [01/10] simplest way: hzp allocated at boot time in hugepage_init();
- [09/10] lazy allocation on first use;
- [10/10] lockless refcounting + shrinker-reclaimable hzp;

We set it up in do_huge_pmd_anonymous_page() if the area around the fault
address is suitable for THP and we've got a read page fault.
If we fail to set up hzp (ENOMEM) we fall back to handle_pte_fault() as we
normally do in THP.

On a wp fault to hzp we allocate real memory for the huge page and clear it.
If ENOMEM, graceful fallback: we create a new pmd table and set the pte
around the fault address to a newly allocated normal (4k) page. All other
ptes in the pmd are set to the normal zero page.

We cannot split hzp (and it's a bug if we try), but we can split the pmd
which points to it. On splitting the pmd we create a table with all ptes
set to the normal zero page.

Patchset organized in bisect-friendly way:
 Patches 01-07: prepare all code paths for hzp
 Patch 08: all code paths are covered: safe to setup hzp
 Patch 09: lazy allocation
 Patch 10: lockless refcounting for hzp

v4:
 - Rebase to v3.7-rc1;
 - Update commit message;
v3:
 - fix potential deadlock in refcounting code on preemptive kernel.
 - do not mark huge zero page as movable.
 - fix typo in comment.
 - Reviewed-by tag from Andrea Arcangeli.
v2:
 - Avoid find_vma() if we've already had vma on stack.
   Suggested by Andrea Arcangeli.
 - Implement refcounting for huge zero page.

--

At hpa's request I've tried an alternative approach to the hzp implementation
(see the Virtual huge zero page patchset): a pmd table with all entries set
to the zero page. This way should be more cache friendly, but it increases
TLB pressure.

The problem with the virtual huge zero page: it requires per-arch enabling.
We need a way to mark that a pmd table has all ptes set to the zero page.

Some numbers to compare two implementations (on 4s Westmere-EX):

Microbenchmark1
==

test:
posix_memalign((void **)&p, 2 * MB, 8 * GB);
for (i = 0; i < 100; i++) {
assert(memcmp(p, p + 4*GB, 4*GB) == 0);
asm volatile ("": : :"memory");
}

hzp:
 Performance counter stats for './test_memcmp' (5 runs):

      32356.272845 task-clock                #    0.998 CPUs utilized            ( +-  0.13% )
                40 context-switches          #    0.001 K/sec                    ( +-  0.94% )
                 0 CPU-migrations            #    0.000 K/sec
             4,218 page-faults               #    0.130 K/sec                    ( +-  0.00% )
    76,712,481,765 cycles                    #    2.371 GHz                      ( +-  0.13% ) [83.31%]
    36,279,577,636 stalled-cycles-frontend   #   47.29% frontend cycles idle     ( +-  0.28% ) [83.35%]
     1,684,049,110 stalled-cycles-backend    #    2.20% backend  cycles idle     ( +-  2.96% ) [66.67%]
   134,355,715,816 instructions              #    1.75  insns per cycle
                                             #    0.27  stalled cycles per insn  ( +-  0.10% ) [83.35%]
    13,526,169,702 branches                  #  418.039 M/sec                    ( +-  0.10% ) [83.31%]
         1,058,230 branch-misses             #    0.01% of all branches          ( +-  0.91% ) [83.36%]

      32.413866442 seconds time elapsed                                          ( +-  0.13% )

vhzp:
 Performance counter stats for './test_memcmp' (5 runs):

      30327.183829 task-clock                #    0.998 CPUs utilized            ( +-  0.13% )
                38 context-switches          #    0.001 K/sec                    ( +-  1.53% )
                 0 CPU-migrations            #    0.000 K/sec
             4,218 page-faults               #    0.139 K/sec                    ( +-  0.01% )
    71,964,773,660 cycles                    #    2.373

[PATCH v4 05/10] thp: change_huge_pmd(): keep huge zero page write-protected

2012-10-14 Thread Kirill A. Shutemov

From: "Kirill A. Shutemov" 

We want to get page fault on write attempt to huge zero page, so let's
keep it write-protected.

Signed-off-by: Kirill A. Shutemov 
---
 mm/huge_memory.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 76548b1..8dbb1e4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1258,6 +1258,8 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
pmd_t entry;
entry = pmdp_get_and_clear(mm, addr, pmd);
entry = pmd_modify(entry, newprot);
+   if (is_huge_zero_pmd(entry))
+   entry = pmd_wrprotect(entry);
set_pmd_at(mm, addr, pmd, entry);
spin_unlock(&vma->vm_mm->page_table_lock);
ret = 1;
-- 
1.7.7.6


[PATCH v4 08/10] thp: setup huge zero page on non-write page fault

2012-10-14 Thread Kirill A. Shutemov

From: "Kirill A. Shutemov" 

All code paths seem to be covered. Now we can map the huge zero page on a
read page fault.

We set it up in do_huge_pmd_anonymous_page() if the area around the fault
address is suitable for THP and we've got a read page fault.

If we fail to set up the huge zero page (ENOMEM) we fall back to
handle_pte_fault() as we normally do in THP.

Signed-off-by: Kirill A. Shutemov 
---
 mm/huge_memory.c |   10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b267b12..da7e07b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -725,6 +725,16 @@ int do_huge_pmd_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
return VM_FAULT_OOM;
if (unlikely(khugepaged_enter(vma)))
return VM_FAULT_OOM;
+   if (!(flags & FAULT_FLAG_WRITE)) {
+   pgtable_t pgtable;
+   pgtable = pte_alloc_one(mm, haddr);
+   if (unlikely(!pgtable))
+   goto out;
+   spin_lock(&mm->page_table_lock);
+   set_huge_zero_page(pgtable, mm, vma, haddr, pmd);
+   spin_unlock(&mm->page_table_lock);
+   return 0;
+   }
page = alloc_hugepage_vma(transparent_hugepage_defrag(vma),
  vma, haddr, numa_node_id(), 0);
if (unlikely(!page)) {
-- 
1.7.7.6


[PATCH v4 07/10] thp: implement splitting pmd for huge zero page

2012-10-14 Thread Kirill A. Shutemov

From: "Kirill A. Shutemov" 

We can't split the huge zero page itself (and it's a bug if we try), but we
can split the pmd which points to it.

On splitting the pmd we create a table with all ptes set to the normal zero
page.

Signed-off-by: Kirill A. Shutemov 
---
 mm/huge_memory.c |   47 ---
 1 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 87359f1..b267b12 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1610,6 +1610,7 @@ int split_huge_page(struct page *page)
struct anon_vma *anon_vma;
int ret = 1;
 
+   BUG_ON(is_huge_zero_pfn(page_to_pfn(page)));
BUG_ON(!PageAnon(page));
anon_vma = page_lock_anon_vma(page);
if (!anon_vma)
@@ -2508,23 +2509,63 @@ static int khugepaged(void *none)
return 0;
 }
 
+static void __split_huge_zero_page_pmd(struct vm_area_struct *vma,
+   unsigned long haddr, pmd_t *pmd)
+{
+   pgtable_t pgtable;
+   pmd_t _pmd;
+   int i;
+
+   pmdp_clear_flush(vma, haddr, pmd);
+   /* leave pmd empty until pte is filled */
+
+   pgtable = get_pmd_huge_pte(vma->vm_mm);
+   pmd_populate(vma->vm_mm, &_pmd, pgtable);
+
+   for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
+   pte_t *pte, entry;
+   entry = pfn_pte(my_zero_pfn(haddr), vma->vm_page_prot);
+   entry = pte_mkspecial(entry);
+   pte = pte_offset_map(&_pmd, haddr);
+   VM_BUG_ON(!pte_none(*pte));
+   set_pte_at(vma->vm_mm, haddr, pte, entry);
+   pte_unmap(pte);
+   }
+   smp_wmb(); /* make pte visible before pmd */
+   pmd_populate(vma->vm_mm, pmd, pgtable);
+}
+
 void __split_huge_page_pmd(struct vm_area_struct *vma, unsigned long address,
pmd_t *pmd)
 {
struct page *page;
+   struct mm_struct *mm = vma->vm_mm;
unsigned long haddr = address & HPAGE_PMD_MASK;
+   unsigned long mmun_start;   /* For mmu_notifiers */
+   unsigned long mmun_end; /* For mmu_notifiers */
 
BUG_ON(vma->vm_start > haddr || vma->vm_end < haddr + HPAGE_PMD_SIZE);
 
-   spin_lock(&vma->vm_mm->page_table_lock);
+   mmun_start = haddr;
+   mmun_end   = address + HPAGE_PMD_SIZE;
+   mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+   spin_lock(&mm->page_table_lock);
if (unlikely(!pmd_trans_huge(*pmd))) {
-   spin_unlock(&vma->vm_mm->page_table_lock);
+   spin_unlock(&mm->page_table_lock);
+   mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+   return;
+   }
+   if (is_huge_zero_pmd(*pmd)) {
+   __split_huge_zero_page_pmd(vma, haddr, pmd);
+   spin_unlock(&mm->page_table_lock);
+   mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
return;
}
page = pmd_page(*pmd);
VM_BUG_ON(!page_count(page));
get_page(page);
-   spin_unlock(&vma->vm_mm->page_table_lock);
+   spin_unlock(&mm->page_table_lock);
+   mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
 
split_huge_page(page);
 
-- 
1.7.7.6
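
The ordering used by __split_huge_zero_page_pmd() above (fill the pte table while the pmd is still empty, issue a write barrier, then populate the pmd) can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: struct pte_table, pmd_slot, split_to_zero_table() and lookup_table() are hypothetical names, C11 release/acquire atomics stand in for smp_wmb() and the reader-side ordering, and 512 entries corresponds to HPAGE_PMD_NR on x86-64 with 4K pages.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define ENTRIES_PER_TABLE 512	/* HPAGE_PMD_NR on x86-64 with 4K pages */

/* Toy page table: each entry just stores a page frame number. */
struct pte_table {
	unsigned long pte[ENTRIES_PER_TABLE];
};

/* The "pmd" slot: NULL while empty, else points at a filled table. */
static _Atomic(struct pte_table *) pmd_slot;

/*
 * Fill a table so that every entry refers to zero_pfn, then publish it.
 * The release store plays the role of smp_wmb() followed by
 * pmd_populate(): a reader that observes the pointer also observes the
 * entries written before it.
 */
static void split_to_zero_table(struct pte_table *table, unsigned long zero_pfn)
{
	size_t i;

	assert(table != NULL);

	/* leave the slot empty until the table is filled */
	atomic_store(&pmd_slot, (struct pte_table *)NULL);

	for (i = 0; i < ENTRIES_PER_TABLE; i++)
		table->pte[i] = zero_pfn;

	atomic_store_explicit(&pmd_slot, table, memory_order_release);
}

/* Reader side: returns NULL or a table whose entries are all visible. */
static const struct pte_table *lookup_table(void)
{
	return atomic_load_explicit(&pmd_slot, memory_order_acquire);
}
```

The point of the model is only the ordering: a reader either sees an empty slot or a completely filled table, never a published table with stale entries.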


[PATCH v4 09/10] thp: lazy huge zero page allocation

2012-10-14 Thread Kirill A. Shutemov

From: "Kirill A. Shutemov" 

Instead of allocating the huge zero page in hugepage_init() we can postpone
the allocation until the huge zero page is first mapped. This saves memory if
THP is not in use.

cmpxchg() is used to avoid a race on huge_zero_pfn initialization.

Signed-off-by: Kirill A. Shutemov 
---
 mm/huge_memory.c |   20 ++--
 1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index da7e07b..8fae26a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -159,22 +159,24 @@ static int start_khugepaged(void)
return err;
 }
 
-static int init_huge_zero_page(void)
+static int init_huge_zero_pfn(void)
 {
struct page *hpage;
+   unsigned long pfn;
 
hpage = alloc_pages((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
HPAGE_PMD_ORDER);
if (!hpage)
return -ENOMEM;
-
-   huge_zero_pfn = page_to_pfn(hpage);
+   pfn = page_to_pfn(hpage);
+   if (cmpxchg(&huge_zero_pfn, 0, pfn))
+   __free_page(hpage);
return 0;
 }
 
 static inline bool is_huge_zero_pfn(unsigned long pfn)
 {
-   return pfn == huge_zero_pfn;
+   return huge_zero_pfn && pfn == huge_zero_pfn;
 }
 
 static inline bool is_huge_zero_pmd(pmd_t pmd)
@@ -563,10 +565,6 @@ static int __init hugepage_init(void)
if (err)
return err;
 
-   err = init_huge_zero_page();
-   if (err)
-   goto out;
-
err = khugepaged_slab_init();
if (err)
goto out;
@@ -589,8 +587,6 @@ static int __init hugepage_init(void)
 
return 0;
 out:
-   if (huge_zero_pfn)
-   __free_page(pfn_to_page(huge_zero_pfn));
hugepage_exit_sysfs(hugepage_kobj);
return err;
 }
@@ -727,6 +723,10 @@ int do_huge_pmd_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
return VM_FAULT_OOM;
if (!(flags & FAULT_FLAG_WRITE)) {
pgtable_t pgtable;
+   if (unlikely(!huge_zero_pfn && init_huge_zero_pfn())) {
+   count_vm_event(THP_FAULT_FALLBACK);
+   goto out;
+   }
pgtable = pte_alloc_one(mm, haddr);
if (unlikely(!pgtable))
goto out;
-- 
1.7.7.6
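
The lazy-allocation pattern in this patch (allocate, try to publish with a compare-and-exchange, free the copy if another thread won) can be sketched in userspace with C11 atomics. This is an illustration only, not the kernel implementation: zero_buf, lost_races, init_zero_buf(), get_zero_buf() and is_zero_buf() are hypothetical names, and calloc()/free() stand in for the page allocator.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* 0 means "not allocated yet", mirroring huge_zero_pfn == 0 in the patch. */
static _Atomic uintptr_t zero_buf;

/* Counts how many losing allocations were freed (for illustration only). */
static atomic_int lost_races;

/*
 * Allocate a zeroed buffer and try to install it as the shared one.
 * Returns 0 on success (our buffer or somebody else's is installed),
 * -1 if the allocation failed.
 */
static int init_zero_buf(size_t size)
{
	uintptr_t expected = 0;
	void *buf = calloc(1, size);

	if (!buf)
		return -1;

	/* Publish only if nobody beat us to it; otherwise drop our copy. */
	if (!atomic_compare_exchange_strong(&zero_buf, &expected,
					    (uintptr_t)buf)) {
		free(buf);
		atomic_fetch_add(&lost_races, 1);
	}
	return 0;
}

/* Lazy accessor: allocate on first call, then return the shared buffer. */
static const void *get_zero_buf(size_t size)
{
	assert(size > 0);
	if (!atomic_load(&zero_buf) && init_zero_buf(size))
		return NULL;
	return (const void *)atomic_load(&zero_buf);
}

/* Mirrors is_huge_zero_pfn(): never matches before initialisation. */
static int is_zero_buf(const void *p)
{
	uintptr_t cur = atomic_load(&zero_buf);

	return cur && (uintptr_t)p == cur;
}
```

The extra `cur &&` test matters for the same reason as in is_huge_zero_pfn(): before the first allocation the shared value is 0, and a comparison against 0 must not report a match.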


[PATCH v4 01/10] thp: huge zero page: basic preparation

2012-10-14 Thread Kirill A. Shutemov

From: "Kirill A. Shutemov" 

Huge zero page (hzp) is a non-movable huge page (2M on x86-64) filled
with zeros.

For now let's allocate the page on hugepage_init(). We'll switch to lazy
allocation later.

We are not going to map the huge zero page until we can handle it
properly on all code paths.

is_huge_zero_{pfn,pmd}() functions will be used by following patches to
check whether the pfn/pmd is huge zero page.

Signed-off-by: Kirill A. Shutemov 
---
 mm/huge_memory.c |   30 ++
 1 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a863af2..438adbf 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -46,6 +46,7 @@ static unsigned int khugepaged_scan_sleep_millisecs __read_mostly = 10000;
 /* during fragmentation poll the hugepage allocator once every minute */
 static unsigned int khugepaged_alloc_sleep_millisecs __read_mostly = 60000;
 static struct task_struct *khugepaged_thread __read_mostly;
+static unsigned long huge_zero_pfn __read_mostly;
 static DEFINE_MUTEX(khugepaged_mutex);
 static DEFINE_SPINLOCK(khugepaged_mm_lock);
 static DECLARE_WAIT_QUEUE_HEAD(khugepaged_wait);
@@ -158,6 +159,29 @@ static int start_khugepaged(void)
return err;
 }
 
+static int init_huge_zero_page(void)
+{
+   struct page *hpage;
+
+   hpage = alloc_pages((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
+   HPAGE_PMD_ORDER);
+   if (!hpage)
+   return -ENOMEM;
+
+   huge_zero_pfn = page_to_pfn(hpage);
+   return 0;
+}
+
+static inline bool is_huge_zero_pfn(unsigned long pfn)
+{
+   return pfn == huge_zero_pfn;
+}
+
+static inline bool is_huge_zero_pmd(pmd_t pmd)
+{
+   return is_huge_zero_pfn(pmd_pfn(pmd));
+}
+
 #ifdef CONFIG_SYSFS
 
 static ssize_t double_flag_show(struct kobject *kobj,
@@ -539,6 +563,10 @@ static int __init hugepage_init(void)
if (err)
return err;
 
+   err = init_huge_zero_page();
+   if (err)
+   goto out;
+
err = khugepaged_slab_init();
if (err)
goto out;
@@ -561,6 +589,8 @@ static int __init hugepage_init(void)
 
return 0;
 out:
+   if (huge_zero_pfn)
+   __free_page(pfn_to_page(huge_zero_pfn));
hugepage_exit_sysfs(hugepage_kobj);
return err;
 }
-- 
1.7.7.6


Re: [RFC PATCH 3/3] Convert mce_disabled

2012-10-14 Thread Naveen N. Rao


On 10/12/2012 05:26 PM, Borislav Petkov wrote:
> On Fri, Oct 12, 2012 at 04:20:40PM +0530, Naveen N. Rao wrote:
> > Hi Boris, Thanks for getting to this before I could!
>
> Ah ok, I thought you weren't interested in doing this anymore :).

Sorry - just got sidetracked a bit, I'm afraid :)

> > I had a look but I still feel boolean is a better way to go. With
> > bool, we can get rid of the #defines above and more importantly, the
> > aux field in dev_ext_attribute since that is used in other places
> > too. Further, I suspect we'll still end up using the same or less
> > memory since we don't have that many boolean members within the MCA
> > code.
>
> My main intention was to have all those in a single struct and use a
> single store_bit/show_bit function.
>
> Sure, you can do bools but this'll still be single variables spread
> around in mce.c instead of one single struct mca_config which nicely
> encapsulates all the configuration we do in the MCA code.
>
> Or, you can modify the mca_config I have there and use bools and pass a
> pointer to each actual bool member in each DEVICE_BIT_ATTR invocation
> (and rename it to DEVICE_BOOL_ATTR). Yeah, that could work, unless I'm
> missing something else, of course.

Yes, this is what I had in mind. Though your bitfield code is nicely
done, I felt that booleans would fit better in this specific case.
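
The variant discussed here (one config struct with bool members, each attribute carrying a pointer to its member, and a single shared show/store pair) might look roughly like the userspace model below. Everything in it is an assumption made for illustration: the member names, struct bool_attr, the BOOL_ATTR() macro and the bool_show()/bool_store() helpers are hypothetical and are not the sysfs API or the code from either patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical config struct; the member names are illustrative only. */
struct mca_config {
	bool disabled;
	bool ser;
	bool bios_cmci_threshold;
};

static struct mca_config mca_cfg;

/* One attribute = a name plus a pointer to the bool it controls. */
struct bool_attr {
	const char *name;
	bool *var;
};

#define BOOL_ATTR(_member) { #_member, &mca_cfg._member }

static struct bool_attr attrs[] = {
	BOOL_ATTR(disabled),
	BOOL_ATTR(ser),
	BOOL_ATTR(bios_cmci_threshold),
};

/* Shared "show": writes "0\n" or "1\n" and returns the length written. */
static int bool_show(const struct bool_attr *a, char *buf, size_t len)
{
	return snprintf(buf, len, "%d\n", *a->var);
}

/* Shared "store": accepts "0" or "1", optionally newline-terminated. */
static int bool_store(const struct bool_attr *a, const char *buf)
{
	assert(a != NULL && buf != NULL);
	if ((buf[0] != '0' && buf[0] != '1') ||
	    (buf[1] != '\0' && strcmp(buf + 1, "\n") != 0))
		return -1;
	*a->var = (buf[0] == '1');
	return 0;
}
```

Because each attribute points straight at its bool, no per-attribute bit index or auxiliary field is needed, which is the simplification argued for above.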



Thanks,
Naveen


Re: [E1000-devel] [PATCH] RX initialization sequence fixed - enable RX after corresponding ring initialization only

2012-10-14 Thread Jeff Kirsher

On Sun, 2012-10-14 at 19:19 +0200, Dmitry Fleytman wrote:
> Reported-by: Chris Webb 
> Reported-by: Richard Davies 
> 
> Signed-off-by: Dmitry Fleytman 
> ---
>  drivers/net/ethernet/intel/e1000/e1000_ethtool.c |9 +
>  drivers/net/ethernet/intel/e1000/e1000_main.c|   18
> --
>  2 files changed, 21 insertions(+), 6 deletions(-) 

I will add it to my queue.  Thanks!



Re: [PATCH] cpufreq, powernow-k8: Remove usage of smp_processor_id() in preemptible code

2012-10-14 Thread Rafael J. Wysocki

> On Sunday 14 of October 2012 10:27:22 Rafael J. Wysocki wrote:
> > Hi,
> > 
> > Thanks for the patch!  I'll queue it up for v3.7 when I get back home from
> > the current trip (around the -rc3 time frame I suppose).
> > 
> > In future please don't send patches directly to sta...@vger.kernel.org.
> > That doesn't make -stable pick them up anyway and confuses things.
> 
> That happens anyway if you tag the patch for stable and use git
> send-email. Unless you go the extra mile and filter out the cc list,
> which is tedious.

Well, please don't tag patches for -stable, because -stable doesn't take
_patches_.  It takes commits from Linus' tree and backports them, and
it's the maintainer's job to tag them for -stable, not yours.

You can give the maintainer a hint that you _think_ it's -stable material
(e.g. in the additional patch description that goes after the changelog),
but the maintainer may still disagree with you and may not tag the commit
for -stable after all.

> Besides, I'm pretty sure stable maintainers verify a patch is actually
> upstream before applying it anyway.

Yes, they do, but that means it doesn't make sense to send them stuff
before it's been merged, right?

Rafael

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: [PATCH 2/3 v2] Do not use acpi_device to find pci root bridge in _init code.

2012-10-14 Thread Taku Izumi

On Fri, 12 Oct 2012 20:34:20 +0800
Tang Chen  wrote:

> When the kernel is being initialized, and some hardwares are not added
> to system, there won't be acpi_device structs for these devices. But
> acpi_is_root_bridge() depends on acpi_device struct. As a result, all
> the not-added root bridge will not be judged as a root bridge in
> find_root_bridges(). And further more, no handle_hotplug_event_root()
> notifier will be installed for them.
> 
> This patch introduces a new api to find all root bridges in system by
> getting HID directly from ACPI namespace, not depending on acpi_device
> struct.

  How about squashing patch #2 into patch #1?
  To my mind, the caller and the callee should be introduced in the same patch.

  Best regards,
  Taku Izumi

> Signed-off-by: Tang Chen 
> Signed-off-by: Liu Jiang 
> ---
>  drivers/acpi/pci_root.c |   19 +++
>  1 files changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index 6151d83..582eb11 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -129,20 +129,23 @@ EXPORT_SYMBOL_GPL(acpi_get_pci_rootbridge_handle);
>   * acpi_is_root_bridge - determine whether an ACPI CA node is a PCI root 
> bridge
>   * @handle - the ACPI CA node in question.
>   *
> - * Note: we could make this API take a struct acpi_device * instead, but
> - * for now, it's more convenient to operate on an acpi_handle.
> + * Note: If a device is not added to the system yet, there won't be an
> + * acpi_device struct for it. So do not get HID and CID from acpi_device,
> + * get them from ACPI namespace directly.
>   */
>  int acpi_is_root_bridge(acpi_handle handle)
>  {
> - int ret;
> - struct acpi_device *device;
> + struct acpi_device_info *info;
> + acpi_status status;
>  
> - ret = acpi_bus_get_device(handle, &device);
> - if (ret)
> + status = acpi_get_object_info(handle, &info);
> + if (ACPI_FAILURE(status)) {
> + printk(KERN_ERR PREFIX "%s: Error reading"
> +"device info\n", __func__);
>   return 0;
> + }
>  
> - ret = acpi_match_device_ids(device, root_device_ids);
> - if (ret)
> + if (acpi_match_object_info_ids(info, root_device_ids))
>   return 0;
>   else
>   return 1;
> -- 
> 1.7.1
> 


-- 
Taku Izumi 


Re: [Bug fix] nfs-client: fix nfs_inode_attrs_need_update for async read_done comes during truncating to smaller size

2012-10-14 Thread Chen Gang

On 2012-10-15 12:52, Chen Gang wrote:
>> Now, what are the conditions of your test setup? The above bug report is
>> meaningless unless it includes a description of what is being exported
>> by the server (including a proper listing of the contents
>> of /etc/exports and /proc/mounts). It should also include a description
>> of the NFS client mount options (see /proc/mounts on the client).

The exportfs command line is:
  rsh -n $RHOST "/usr/sbin/exportfs -i -o no_root_squash,rw *:$TESTDIR"

  $RHOST is dhcp122.asianux.net (10.1.0.122, no password required)
  $TESTDIR is just the mount directory.


> they are below, if you need additional information, please tell me again.
> 


-- 
Chen Gang

Asianux Corporation

Re: [PATCH RFC 2/6 v3] gpio: Add sysfs support to block GPIO API

2012-10-14 Thread Ryan Mallon

On 13/10/12 06:11, Roland Stigge wrote:
> This patch adds sysfs support to the block GPIO API.
> 
> Signed-off-by: Roland Stigge 

Hi Roland,

Some comments below,

~Ryan

> 
> ---
>  Documentation/ABI/testing/sysfs-gpio |6 
>  drivers/gpio/gpiolib.c   |  226 
> ++-
>  include/asm-generic/gpio.h   |   11 +
>  include/linux/gpio.h |   13 ++
>  4 files changed, 254 insertions(+), 2 deletions(-)
> 
> --- linux-2.6.orig/Documentation/ABI/testing/sysfs-gpio
> +++ linux-2.6/Documentation/ABI/testing/sysfs-gpio
> @@ -24,4 +24,8 @@ Description:
>   /base ... (r/o) same as N
>   /label ... (r/o) descriptive, not necessarily unique
>   /ngpio ... (r/o) number of GPIOs; numbered N to N + (ngpio - 1)
> -
> + /blockN ... for each GPIO block #N
> + /ngpio ... (r/o) number of GPIOs in this group
> + /exported ... sysfs export state of this group (0, 1)
> + /value ... current value as 32 or 64 bit integer in decimal
> +   (only available if /exported is 1)
> --- linux-2.6.orig/drivers/gpio/gpiolib.c
> +++ linux-2.6/drivers/gpio/gpiolib.c
> @@ -974,6 +974,218 @@ static void gpiochip_unexport(struct gpi
>   chip->label, status);
>  }
>  
> +static ssize_t gpio_block_ngpio_show(struct device *dev,
> +  struct device_attribute *attr, char *buf)
> +{
> + const struct gpio_block *block = dev_get_drvdata(dev);
> +
> + return sprintf(buf, "%u\n", block->ngpio);
> +}
> +static struct device_attribute
> +dev_attr_block_ngpio = __ATTR(ngpio, 0444, gpio_block_ngpio_show, NULL);

static DEVICE_ATTR(ngpio, S_IRUGO, gpio_block_ngpio_show, NULL);

> +
> +static ssize_t gpio_block_value_show(struct device *dev,
> +  struct device_attribute *attr, char *buf)
> +{
> + const struct gpio_block *block = dev_get_drvdata(dev);
> +
> + return sprintf(buf, "%u\n", gpio_block_get(block));

Printing the value of a bunch of pins as a decimal is a bit odd. Hex, or
a bitmap would be more appropriate.

> +}
> +
> +static bool gpio_block_is_output(struct gpio_block *block)
> +{
> + int i;
> +
> + for (i = 0; i < block->ngpio; i++)
> + if (!test_bit(FLAG_IS_OUT, &gpio_desc[block->gpio[i]].flags))
> + return false;

Shouldn't a block force all of the pins to be the same direction? Or at
least have gpio_block_set skip pins which aren't outputs.

> + return true;
> +}
> +
> +static ssize_t gpio_block_value_store(struct device *dev,
> +   struct device_attribute *attr,
> +   const char *buf, size_t size)
> +{
> + ssize_t status;
> + struct gpio_block   *block = dev_get_drvdata(dev);
> + unsigned long   value;
> +
> + mutex_lock(&sysfs_lock);
> +
> + status = kstrtoul(buf, 0, &value);
> + if (status == 0) {

You don't need to do the kstrtoul under the lock:

err = kstrtoul(buf, 0, &value);
if (err)
return err;

mutex_lock(&sysfs_lock);
...

Global lock is a bit lame, it serialises all of your bitbanged busses
against each other. Why is it not part of the gpio_block structure?
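
Both suggestions above (parse before taking the lock, and keep the lock in the block rather than global) can be sketched in a userspace model. This is only an illustration under stated assumptions: struct block is a hypothetical stand-in for struct gpio_block, an atomic_flag spin lock stands in for a kernel mutex, and strtoul() stands in for kstrtoul(), which is stricter about its input. The show helper prints hex, in line with the earlier remark about decimal output.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct gpio_block, with its own lock. */
struct block {
	atomic_flag lock;	/* per-block lock instead of a global one */
	int is_output;
	unsigned long value;
};

static void block_lock(struct block *b)
{
	while (atomic_flag_test_and_set_explicit(&b->lock,
						 memory_order_acquire))
		;	/* spin; a kernel driver would use a mutex here */
}

static void block_unlock(struct block *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Returns the number of bytes consumed, or a negative errno value. */
static long block_value_store(struct block *b, const char *buf, size_t size)
{
	char *end;
	unsigned long value;
	long status;

	assert(b != NULL && buf != NULL);

	/* Parse first: no shared state is touched, so no lock is needed. */
	errno = 0;
	value = strtoul(buf, &end, 0);
	if (errno == ERANGE)
		return -ERANGE;
	if (end == buf || (*end != '\0' && *end != '\n'))
		return -EINVAL;

	block_lock(b);
	if (b->is_output) {
		b->value = value;
		status = (long)size;
	} else {
		status = -EPERM;
	}
	block_unlock(b);

	return status;
}

/* Format the block value as hex, which maps onto pins more readily. */
static int block_value_show(struct block *b, char *buf, size_t len)
{
	unsigned long value;

	block_lock(b);
	value = b->value;
	block_unlock(b);

	return snprintf(buf, len, "0x%lx\n", value);
}
```

With the lock inside the block, two independent blocks no longer serialise against each other, and a malformed write is rejected without ever touching the lock.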


> + if (gpio_block_is_output(block)) {
> + gpio_block_set(block, value);
> + status = size;
> + } else {
> + status = -EPERM;
> + }
> + }
> +
> + mutex_unlock(&sysfs_lock);
> + return status;
> +}
> +
> +static struct device_attribute
> +dev_attr_block_value = __ATTR(value, 0644, gpio_block_value_show,
> +   gpio_block_value_store);

Use DEVICE_ATTR and S_IWUSR | S_IRUGO permission macros.

> +
> +static int gpio_block_value_is_exported(struct gpio_block *block)
> +{
> + struct device   *dev;
> + struct sysfs_dirent *sd = NULL;
> +
> + mutex_lock(&sysfs_lock);
> + dev = class_find_device(&gpio_class, NULL, block, match_export);
> + if (!dev)
> + goto out;
> +
> + sd = sysfs_get_dirent(dev->kobj.sd, NULL, "value");
> +
> +out:
> + mutex_unlock(&sysfs_lock);
> + return sd ? 1 : 0;

  return sd;

or:

  return !!sd;

> +}
> +
> +static ssize_t gpio_block_exported_show(struct device *dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct gpio_block   *block = dev_get_drvdata(dev);
> +
> + return sprintf(buf, "%u\n", gpio_block_value_is_exported(block));
> +}
> +
> +static int gpio_block_value_export(struct gpio_block *block)
> +{
> + struct device   *dev;
> + int status;
> + int i;
> +
> + mutex_lock(&sysfs_lock);
> +
> + for (i = 0; i < block->ngpio; i++) {
> + status = gpio_request(block->gpio[i], "s

[PATCHv5 0/4] modem_shm: U8500 SHaRed Memory driver(SHRM)

2012-10-14 Thread Arun Murthy

On the u8500 platform, the APE (Application Processor) and the modem
subsystem (CMT) communicate through a shared DDR region. This patch series
adds the ShaRed Memory (SHRM) protocol used for communication between the
APE and the CMT. The interrupt generation registers in the CMT and the PRCMU
on the APE side are used to support the SHRM protocol.

v2 - Included netdev mailing list
v3 - Implemented comments from Alan Cox and Greg KH
v4 - Re-worked on the ModemAccessFramework(MAF) part
v5 - Added kernel doc for exported functions in MAF.

Arun Murthy (4):
  modem_shm: Add Modem Access Framework
  modem_shm: Register u8500 client for MAF
  modem_shm: u8500-shm: U8500 Shared Memory Driver
  Doc: Add u8500_shrm document

 Documentation/DocBook/Makefile  |2 +-
 Documentation/DocBook/shrm.tmpl |  125 +++
 Documentation/modem_shm/u8500_shrm.txt  |  254 +
 drivers/Kconfig |2 +
 drivers/Makefile|1 +
 drivers/modem_shm/Kconfig   |   22 +
 drivers/modem_shm/Makefile  |3 +
 drivers/modem_shm/modem_access.c|  413 +++
 drivers/modem_shm/modem_u8500.c |   96 ++
 drivers/modem_shm/u8500_shm/Kconfig |   43 +
 drivers/modem_shm/u8500_shm/Makefile|7 +
 drivers/modem_shm/u8500_shm/shrm.h  |   23 +
 drivers/modem_shm/u8500_shm/shrm_char.c |  816 ++
 drivers/modem_shm/u8500_shm/shrm_config.h   |  114 ++
 drivers/modem_shm/u8500_shm/shrm_driver.c   |  733 
 drivers/modem_shm/u8500_shm/shrm_driver.h   |  226 
 drivers/modem_shm/u8500_shm/shrm_fifo.c |  838 ++
 drivers/modem_shm/u8500_shm/shrm_ioctl.h|   43 +
 drivers/modem_shm/u8500_shm/shrm_net.c  |  313 ++
 drivers/modem_shm/u8500_shm/shrm_net.h  |   46 +
 drivers/modem_shm/u8500_shm/shrm_private.h  |  184 +++
 drivers/modem_shm/u8500_shm/shrm_protocol.c | 1591 +++
 include/linux/modem_shm/modem.h |   64 ++
 include/linux/modem_shm/modem_client.h  |   55 +
 24 files changed, 6013 insertions(+), 1 deletions(-)
 create mode 100644 Documentation/DocBook/shrm.tmpl
 create mode 100644 Documentation/modem_shm/u8500_shrm.txt
 create mode 100644 drivers/modem_shm/Kconfig
 create mode 100644 drivers/modem_shm/Makefile
 create mode 100644 drivers/modem_shm/modem_access.c
 create mode 100644 drivers/modem_shm/modem_u8500.c
 create mode 100644 drivers/modem_shm/u8500_shm/Kconfig
 create mode 100644 drivers/modem_shm/u8500_shm/Makefile
 create mode 100644 drivers/modem_shm/u8500_shm/shrm.h
 create mode 100644 drivers/modem_shm/u8500_shm/shrm_char.c
 create mode 100644 drivers/modem_shm/u8500_shm/shrm_config.h
 create mode 100644 drivers/modem_shm/u8500_shm/shrm_driver.c
 create mode 100644 drivers/modem_shm/u8500_shm/shrm_driver.h
 create mode 100644 drivers/modem_shm/u8500_shm/shrm_fifo.c
 create mode 100644 drivers/modem_shm/u8500_shm/shrm_ioctl.h
 create mode 100644 drivers/modem_shm/u8500_shm/shrm_net.c
 create mode 100644 drivers/modem_shm/u8500_shm/shrm_net.h
 create mode 100644 drivers/modem_shm/u8500_shm/shrm_private.h
 create mode 100644 drivers/modem_shm/u8500_shm/shrm_protocol.c
 create mode 100644 include/linux/modem_shm/modem.h
 create mode 100644 include/linux/modem_shm/modem_client.h

-- 
1.7.4.3


[PATCHv5 1/4] modem_shm: Add Modem Access Framework

2012-10-14 Thread Arun Murthy

Adds Modem Access Framework, which allows for registering platform specific
modem access mechanisms. The framework also exposes APIs for client drivers
for getting and releasing access to modem, regardless of the underlying
platform specific access mechanism.
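
As a rough illustration of the get/release semantics described above, the userspace sketch below keeps a use count and calls a platform hook only on the first get and the last put. It is a simplified model and not the code from this patch: struct modem_ops, struct modem, modem_get(), modem_put() and the hw_* helpers are hypothetical names, and a real implementation would also need to serialise the hook calls against concurrent callers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical platform hooks supplied by a platform-specific driver. */
struct modem_ops {
	void (*request)(void);
	void (*release)(void);
};

struct modem {
	const struct modem_ops *ops;
	atomic_int use_cnt;	/* number of clients currently holding access */
};

static int hw_enabled;	/* stands in for the platform-specific state */
static void hw_request(void) { hw_enabled = 1; }
static void hw_release(void) { hw_enabled = 0; }
static const struct modem_ops demo_ops = { hw_request, hw_release };

/* First holder switches access on; later holders only bump the count. */
static void modem_get(struct modem *m)
{
	assert(m != NULL && m->ops != NULL);
	if (atomic_fetch_add(&m->use_cnt, 1) == 0)
		m->ops->request();
}

/* Last holder switches access off. Returns -1 on an unbalanced put. */
static int modem_put(struct modem *m)
{
	int old = atomic_load(&m->use_cnt);

	do {
		if (old == 0)
			return -1;	/* nobody holds access */
	} while (!atomic_compare_exchange_weak(&m->use_cnt, &old, old - 1));

	if (old == 1)
		m->ops->release();
	return 0;
}
```

Client code only ever calls the get/put pair; which mechanism actually grants access stays hidden behind the ops table.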

Signed-off-by: Arun Murthy 
---
 drivers/Kconfig  |2 +
 drivers/Makefile |1 +
 drivers/modem_shm/Kconfig|9 ++
 drivers/modem_shm/Makefile   |1 +
 drivers/modem_shm/modem_access.c |  226 ++
 include/linux/modem_shm/modem.h  |   59 ++
 6 files changed, 298 insertions(+), 0 deletions(-)
 create mode 100644 drivers/modem_shm/Kconfig
 create mode 100644 drivers/modem_shm/Makefile
 create mode 100644 drivers/modem_shm/modem_access.c
 create mode 100644 include/linux/modem_shm/modem.h

diff --git a/drivers/Kconfig b/drivers/Kconfig
index ece958d..dc7c14a 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -152,4 +152,6 @@ source "drivers/vme/Kconfig"
 
 source "drivers/pwm/Kconfig"
 
+source "drivers/modem_shm/Kconfig"
+
 endmenu
diff --git a/drivers/Makefile b/drivers/Makefile
index 5b42184..902dfec 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -139,3 +139,4 @@ obj-$(CONFIG_EXTCON)+= extcon/
 obj-$(CONFIG_MEMORY)   += memory/
 obj-$(CONFIG_IIO)  += iio/
 obj-$(CONFIG_VME_BUS)  += vme/
+obj-$(CONFIG_MODEM_SHM)+= modem_shm/
diff --git a/drivers/modem_shm/Kconfig b/drivers/modem_shm/Kconfig
new file mode 100644
index 0000000..f4b7e54
--- /dev/null
+++ b/drivers/modem_shm/Kconfig
@@ -0,0 +1,9 @@
+config MODEM_SHM
+bool "Modem Access Framework"
+default n
+help
+ Add support for Modem Access Framework. It allows different
+platform specific drivers to register modem access mechanisms
+and allows transparent access to modem to the client drivers.
+
+If unsure, say N.
diff --git a/drivers/modem_shm/Makefile b/drivers/modem_shm/Makefile
new file mode 100644
index 0000000..b77bcc0
--- /dev/null
+++ b/drivers/modem_shm/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_MODEM_SHM):= modem_access.o
diff --git a/drivers/modem_shm/modem_access.c b/drivers/modem_shm/modem_access.c
new file mode 100644
index 0000000..e06c51a
--- /dev/null
+++ b/drivers/modem_shm/modem_access.c
@@ -0,0 +1,226 @@
+/*
+ * Copyright (C) ST-Ericsson SA 2011
+ *
+ * License Terms: GNU General Public License v2
+ * Author: Kumar Sanghvi
+ * Arun Murthy 
+ *
+ * Heavily adapted from Regulator framework.
+ * Provides mechanisms for registering platform specific access
+ * mechanisms for modem.
+ * Also, exposes APIs for getting/releasing the access and even
+ * query the access status, and the modem usage status.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static struct class *modem_class;
+
+static int __modem_get_usage(struct device *dev, void *data)
+{
+   struct modem_desc *mdesc = (struct modem_desc *)data;
+
+   if (!mdesc->mclients) {
+   printk(KERN_ERR "modem_access: modem description is NULL\n");
+   return 0;
+   }
+   return atomic_read(&mdesc->mclients->cnt);
+}
+
+/**
+ * modem_get_usage - check if modem access to APE is already enabled
+ * @mdesc: pointer to the struct modem_desc
+ *
+ * check if any of the client on ape has requested access to modem
+ * and return non-zero on success and zero on failure.
+ */
+int modem_get_usage(struct modem_desc *mdesc)
+{
+   return class_for_each_device(modem_class, NULL, (void *)mdesc, __modem_get_usage);
+}
+
+/**
+ * modem_is_requested - check if modem access to APE is already enabled
+ * @mdesc: pointer to the struct modem_desc
+ *
+ * check for a particular client if ape has requested access to modem
+ * and return non-zero on success and zero on failure.
+ */
+int modem_is_requested(struct modem_desc *mdesc)
+{
+   return atomic_read(&mdesc->mclients->cnt);
+}
+
+/**
+ * modem_release - disable modem access for APE
+ * @mdesc: pointer to the struct modem_desc
+ *
+ * disable modem access to the APE. For a particular client it checks if modem
+ * has already been released and if so returns, else it calls the platform
+ * specific function to disable access to modem.
+ */
+int modem_release(struct modem_desc *mdesc)
+{
+   if (!mdesc->release)
+   return -EFAULT;
+
+   if (modem_is_requested(mdesc)) {
+   atomic_dec(&mdesc->mclients->cnt);
+   if (atomic_read(&mdesc->use_cnt) == 1) {
+   mdesc->release(mdesc);
+   atomic_dec(&mdesc->use_cnt);
+   }
+   } else
+   printk(KERN_WARNING
+   "modem_shm: client %s has not requested modem to release\n",
+   mdesc->mclients->name);
+   return 0;
+}
+
+/**
+ * modem_request - enable modem access for APE
+ * @mdesc: pointer to t

[PATCHv5 4/4] Doc: Add u8500_shrm document

2012-10-14 Thread Arun Murthy

Add documentation for u8500 shared memory (shrm) and a kernel DocBook template.

Signed-off-by: Arun Murthy 
---
 Documentation/DocBook/Makefile |2 +-
 Documentation/DocBook/shrm.tmpl|  125 
 Documentation/modem_shm/u8500_shrm.txt |  254 
 3 files changed, 380 insertions(+), 1 deletions(-)
 create mode 100644 Documentation/DocBook/shrm.tmpl
 create mode 100644 Documentation/modem_shm/u8500_shrm.txt

diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile
index bc3d9f8..673ea06 100644
--- a/Documentation/DocBook/Makefile
+++ b/Documentation/DocBook/Makefile
@@ -14,7 +14,7 @@ DOCBOOKS := z8530book.xml device-drivers.xml \
genericirq.xml s390-drivers.xml uio-howto.xml scsi.xml \
80211.xml debugobjects.xml sh.xml regulator.xml \
alsa-driver-api.xml writing-an-alsa-driver.xml \
-   tracepoint.xml drm.xml media_api.xml
+   tracepoint.xml drm.xml media_api.xml shrm.xml
 
 include $(srctree)/Documentation/DocBook/media/Makefile
 
diff --git a/Documentation/DocBook/shrm.tmpl b/Documentation/DocBook/shrm.tmpl
new file mode 100644
index 000..400f9b2
--- /dev/null
+++ b/Documentation/DocBook/shrm.tmpl
@@ -0,0 +1,125 @@
+
+http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"; []>
+
+
+ 
+  Shared Memory
+  
+   
+Arun
+Murthy
+
+ 
+  arun.mur...@stericsson.com
+ 
+
+   
+  
+
+  
+   2009-2010
+   ST-Ericsson
+  
+
+  
+
+  Linux standard functions
+
+  
+
+  
+   
+   
+ Licence terms: GNU General Public Licence (GPL) version 2.
+   
+  
+ 
+
+
+  
+Introduction
+
+   This documentation describes ST-Ericsson's adaptation of the protocol used for CMT/APE communication when SHaRedMemory is used as the IPC link.
+
+  
+
+  
+Design
+
+   The APE consists of a dual-core Cortex A9 SMP, a multimedia DSP and the PRCMU. The modem consists of 2 Cortex R4 ARM processors.
+   The exchange of messages between the CMT (Cellular Mobile Terminal) and the APE involves copying the data to a shared area in DDR.
+   This region is accessible by both CMT and APE. The design includes 2 channels, common and audio. The common channel is used for exchanging ISI, RPC and SECURITY messages.
+   The audio channel is used for exchanging AUDIO messages. Each channel consists of 2 FIFOs: one for sending messages from CMT to APE and the other from APE to CMT.
+   Each of these FIFOs has a write and a read pointer shared between APE and CMT. The write pointer is updated after a message is copied to the FIFO, and the reader reads messages from the read pointer up to the write pointer. Writer and reader notifications are used to signal the completion of a read/write operation (separate for APE and CMT).
+   The driver includes 4 queues. Once a message is sent from CMT to APE it resides in the FIFO and is then copied to one of the 4 queues based on the message type (ISI, RPC, AUDIO, SECURITY). The net/char device interface then fetches the message from the queue and copies it to the user space buffer.
+
+  
+
+  
+Concepts
+
+   The user space application sends ISI/RPC/AUDIO/SECURITY messages. ISI messages are sent through phonet to the shrm driver. To achieve this there are 2 interfaces to the shrm driver: a net interface used for exchanging ISI messages and a char interface for RPC, AUDIO and SECURITY messages. On receiving any of these messages from the user space application, the driver copies it to memory in kernel space. From there it is copied to the respective FIFO, from where the CMT reads the message.
+   The CMT (Cellular Mobile Terminal) writes messages to the respective FIFO, and they are thereafter copied to the respective queue. The net/char device copies each message from the queue to the user space buffer.
+
+  
+
+  
+ Known Bugs And Assumptions
+  
+ 
+ 
+   None
+   
+ 
+   Assumptions
+   1. ApeShmFifo#0 is 128kB in size. It is used for all transmission except CS audio call data. Expected message size is 1.5kB with a max of 16kB.
+   2. ApeShmFifo#1 is 4kB in size. It is used for transmission of CS audio call data. Expected message size is 24kb.
+   3. CmtShmFifo#0 is 128kB in size. It is used for all transmission except CS audio call data. Expected message size is 1.5kB with a max of 16kB.
+   4. CmtShmFifo#1 is 4kB in size. It is used for transmission of CS audio call data. Expected message size is 24kb.
+   The total size of the FIFOs is 264 kB.
+ 
+   
+ 
+ 
+  
+  
+
+  
+ Public Functions Provided
+ 
+   This section lists the APIs provided by the SHRM driver to phonet drivers.
+ 
+!Edrivers/modem_shm/u8500_shm/shrm_fifo.c
+ 
+   This section lists the APIs provided by the SHRM driver for transmission of RPC, AUDIO and SECURITY messages.
+ 
+!Edrivers/modem_shm/u8500_shm/shrm_char.c
+
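The FIFO handling described in the Design section of the document above can be sketched in plain C. This is only an illustration under stated assumptions, not the shrm driver's code: the names `shm_fifo`, `fifo_write` and `fifo_read`, the one-byte length prefix and the 64-byte size are all hypothetical. It shows the one property the text states explicitly, namely that the write pointer is published only after the message has been copied, and that the reader consumes from the read pointer up to the write pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical size; must be a power of two so that the free-running
 * indices below stay consistent when they wrap around. */
#define FIFO_SIZE 64u

struct shm_fifo {
	uint32_t wr;             /* advanced only by the writer */
	uint32_t rd;             /* advanced only by the reader */
	uint8_t data[FIFO_SIZE];
};

/* Copy one length-prefixed message in; 0 on success, -1 if it does not fit. */
static int fifo_write(struct shm_fifo *f, const void *msg, uint8_t len)
{
	const uint8_t *src = msg;
	uint32_t w = f->wr;
	uint32_t i;

	if (FIFO_SIZE - (w - f->rd) < (uint32_t)len + 1u)
		return -1;

	f->data[w++ % FIFO_SIZE] = len;
	for (i = 0; i < len; i++)
		f->data[w++ % FIFO_SIZE] = src[i];

	f->wr = w;               /* publish only after the copy is complete */
	return 0;
}

/* Fetch the oldest message; returns its length, or -1 if the FIFO is
 * empty or the caller's buffer is too small. */
static int fifo_read(struct shm_fifo *f, void *buf, size_t buflen)
{
	uint8_t *dst = buf;
	uint32_t r = f->rd;
	uint32_t i;
	uint8_t len;

	if (r == f->wr)
		return -1;

	len = f->data[r++ % FIFO_SIZE];
	if (len > buflen)
		return -1;

	for (i = 0; i < len; i++)
		dst[i] = f->data[r++ % FIFO_SIZE];

	f->rd = r;               /* release the space back to the writer */
	return len;
}
```

A FIFO shared between two processors would also need memory barriers and the writer/reader notifications mentioned in the text; neither is modelled here.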

[PATCHv5 2/4] modem_shm: Register u8500 client for MAF

2012-10-14 Thread Arun Murthy

Register with the Modem Access Framework (MAF) for the u8500 platform. This provides an
interface to enable and disable modem access and to query its status.
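
As a rough illustration of the enable/disable interface described above, the sketch below shows a callback table plus a use count, where the first request wakes the modem and the last release lets it sleep. Everything here is hypothetical (`modem_sketch`, `sketch_request`, `fake_wake` and so on are made-up names) and it is not the MAF implementation from this series; it only shows the general pattern a platform client plugs into.

```c
/* Hypothetical callback table, loosely modelled on the .request and
 * .release members that the patch fills in. */
struct modem_ops_sketch {
	int (*request)(void);
	void (*release)(void);
};

struct modem_sketch {
	const struct modem_ops_sketch *ops;
	int use_cnt;             /* number of outstanding requests */
};

/* Stand-ins for the platform calls; they only record the state. */
static int hw_on;
static int fake_wake(void) { hw_on = 1; return 0; }
static void fake_sleep(void) { hw_on = 0; }

static const struct modem_ops_sketch fake_ops = {
	.request = fake_wake,
	.release = fake_sleep,
};

/* The first requester enables the hardware, later ones only count. */
static int sketch_request(struct modem_sketch *m)
{
	if (m->use_cnt == 0) {
		int err = m->ops->request();

		if (err)
			return err;
	}
	m->use_cnt++;
	return 0;
}

/* The last release disables the hardware; an unbalanced release fails. */
static int sketch_release(struct modem_sketch *m)
{
	if (m->use_cnt == 0)
		return -1;
	if (--m->use_cnt == 0)
		m->ops->release();
	return 0;
}
```

The sketch uses a plain int for brevity; a version callable from several contexts would need atomic counters or a lock.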

Signed-off-by: Arun Murthy 
---
 drivers/modem_shm/Kconfig   |   11 +
 drivers/modem_shm/Makefile  |1 +
 drivers/modem_shm/modem_u8500.c |   91 +++
 3 files changed, 103 insertions(+), 0 deletions(-)
 create mode 100644 drivers/modem_shm/modem_u8500.c

diff --git a/drivers/modem_shm/Kconfig b/drivers/modem_shm/Kconfig
index f4b7e54..f59d3dc 100644
--- a/drivers/modem_shm/Kconfig
+++ b/drivers/modem_shm/Kconfig
@@ -7,3 +7,14 @@ config MODEM_SHM
 and allows transparent access to modem to the client drivers.
 
 If unsure, say N.
+
+config MODEM_U8500
+   bool "Modem Access driver for STE U8500 platform"
+   depends on MODEM_SHM
+   default n
+   help
+Add support for Modem Access driver on STE U8500 platform which
+uses Shared Memory as IPC mechanism between Modem processor and
+Application processor.
+
+If unsure, say N.
diff --git a/drivers/modem_shm/Makefile b/drivers/modem_shm/Makefile
index b77bcc0..a9aac0f 100644
--- a/drivers/modem_shm/Makefile
+++ b/drivers/modem_shm/Makefile
@@ -1 +1,2 @@
 obj-$(CONFIG_MODEM_SHM):= modem_access.o
+obj-$(CONFIG_MODEM_U8500)  += modem_u8500.o
diff --git a/drivers/modem_shm/modem_u8500.c b/drivers/modem_shm/modem_u8500.c
new file mode 100644
index 000..924b6a2
--- /dev/null
+++ b/drivers/modem_shm/modem_u8500.c
@@ -0,0 +1,91 @@
+/*
+ * Copyright (C) ST-Ericsson SA 2011
+ *
+ * License Terms: GNU General Public License v2
+ * Author: Kumar Sanghvi
+ * Arun Murthy 
+ *
+ * Platform driver implementing access mechanisms to modem
+ * on U8500 which uses Shared Memory as IPC between Application
+ * Processor and Modem processor.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int u8500_modem_request(struct modem_desc *mdesc)
+{
+   return prcmu_ac_wake_req();
+}
+
+static void u8500_modem_release(struct modem_desc *mdesc)
+{
+   prcmu_ac_sleep_req();
+}
+
+static int u8500_modem_is_requested(struct modem_desc *mdesc)
+{
+   return prcmu_is_ac_wake_requested();
+}
+
+static struct modem_desc u8500_modem_desc = {
+   .request = u8500_modem_request,
+   .release = u8500_modem_release,
+   .is_requested = u8500_modem_is_requested,
+   .name   = "u8500-shrm-modem",
+   .no_clients = 2,
+   .cli_cnt = ATOMIC_INIT(0),
+   .use_cnt = ATOMIC_INIT(0),
+};
+
+static int __devinit u8500_modem_probe(struct platform_device *pdev)
+{
+   int err = 0;
+
+   u8500_modem_desc.dev = &pdev->dev;
+   err = modem_register(&pdev->dev, &u8500_modem_desc);
+   if (err) {
+   pr_err("failed to register %s: err %i\n",
+   u8500_modem_desc.name, err);
+   }
+
+   return err;
+}
+
+static int __devexit u8500_modem_remove(struct platform_device *pdev)
+{
+   modem_unregister(&u8500_modem_desc);
+   return 0;
+}
+
+static struct platform_driver u8500_modem_driver = {
+   .driver = {
+   .name = "u8500-modem",
+   .owner = THIS_MODULE,
+   },
+   .probe = u8500_modem_probe,
+   .remove = __devexit_p(u8500_modem_remove),
+};
+
+static int __init u8500_modem_init(void)
+{
+   int ret;
+
+   ret = platform_driver_register(&u8500_modem_driver);
+   if (ret < 0) {
+   printk(KERN_ERR "u8500_modem: platform driver reg failed\n");
+   return -ENODEV;
+   }
+
+   return 0;
+}
+
+static void __exit u8500_modem_exit(void)
+{
+   platform_driver_unregister(&u8500_modem_driver);
+}
+
+arch_initcall(u8500_modem_init);
-- 
1.7.0.4

Re: [RFC][PATCH] perf: Add a few generic stalled-cycles events

2012-10-14 Thread Anshuman Khandual

On 10/12/2012 06:58 AM, Sukadev Bhattiprolu wrote:
> 
> From 89cb6a25b9f714e55a379467a832ee015014ed11 Mon Sep 17 00:00:00 2001
> From: Sukadev Bhattiprolu 
> Date: Tue, 18 Sep 2012 10:59:01 -0700
> Subject: [PATCH] perf: Add a few generic stalled-cycles events
> 
> The existing generic event 'stalled-cycles-backend' corresponds to
> PM_CMPLU_STALL event in Power7. While this event is useful, detailed
> performance analysis often requires us to find more specific reasons
> for the stalled cycle. For instance, stalled cycles in Power7 can
> occur due to, among others:
> 
>   - instruction fetch unit (IFU),
>   - Load-store-unit (LSU),
>   - Fixed point unit (FXU)
>   - Branch unit (BRU)
> 
> While it is possible to use raw codes to monitor these events, it quickly
> becomes cumbersome with performance analysis frequently requiring mapping
> the raw event codes in reports to their symbolic names.
> 
> This patch is a proposal to try and generalize such perf events. Since
> the code changes are quite simple, I bunched all the 4 events together.
> 
> I am not familiar with how readily these events would map to other
> architectures. Here is some information on the events for Power7:
> 
>   stalled-cycles-fixed-point (PM_CMPLU_STALL_FXU)
> 
>   Following a completion stall, the last instruction to finish
>   before completion resumes was from the Fixed Point Unit.
> 
>   Completion stall is any period when no groups completed and
>   the completion table was not empty for that thread.
> 
>   stalled-cycles-load-store (PM_CMPLU_STALL_LSU)
> 
>   Following a completion stall, the last instruction to finish
>   before completion resumes was from the Load-Store Unit.
> 
>   stalled-cycles-instruction-fetch (PM_CMPLU_STALL_IFU)
> 
>   Following a completion stall, the last instruction to finish
>   before completion resumes was from the Instruction Fetch Unit.
> 
>   stalled-cycles-branch (PM_CMPLU_STALL_BRU)
> 
>   Following a completion stall, the last instruction to finish
>   before completion resumes was from the Branch Unit.
> 
> Looking for feedback on this approach and if this can be further extended.
> Power7 has 530 events[2] out of which a "CPI stack analysis"[1] uses about 26
> events.
> 
> 
> [1] CPI Stack analysis
>   
> https://www.power.org/documentation/commonly-used-metrics-for-performance-analysis
> 
> [2] Power7 events:
>   
> https://www.power.org/documentation/comprehensive-pmu-event-reference-power7/

Here we should try to come up with a generic list of places in the processor
where the cycles can stall.

PERF_COUNT_HW_STALLED_CYCLES_FIXED_POINT
PERF_COUNT_HW_STALLED_CYCLES_LOAD_STORE
PERF_COUNT_HW_STALLED_CYCLES_INSTRUCTION_FETCH
PERF_COUNT_HW_STALLED_CYCLES_BRANCH
PERF_COUNT_HW_STALLED_CYCLES_
PERF_COUNT_HW_STALLED_CYCLES_
PERF_COUNT_HW_STALLED_CYCLES_
---

This generic list can be a superset that accommodates all the architectures,
giving each the flexibility to implement events selectively thereafter. Stall
locations are very important from a CPI analysis standpoint in real-world use
cases. This will definitely help us in that direction.
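
To make the idea concrete, here is a small sketch of how generic stall names could be resolved to architecture-specific events. The four pairs are taken from the quoted patch description for Power7; the table layout and the helper `power7_event_for` are hypothetical and are not part of perf.

```c
#include <stddef.h>
#include <string.h>

/* One generic stall location and the Power7 event it maps to. */
struct stall_event_sketch {
	const char *generic;
	const char *power7;
};

/* Pairs as listed in the quoted patch description. */
static const struct stall_event_sketch stall_events[] = {
	{ "stalled-cycles-fixed-point",       "PM_CMPLU_STALL_FXU" },
	{ "stalled-cycles-load-store",        "PM_CMPLU_STALL_LSU" },
	{ "stalled-cycles-instruction-fetch", "PM_CMPLU_STALL_IFU" },
	{ "stalled-cycles-branch",            "PM_CMPLU_STALL_BRU" },
};

/* Return the Power7 event name, or NULL if this architecture does not
 * implement the requested generic event. */
static const char *power7_event_for(const char *generic)
{
	size_t i;

	for (i = 0; i < sizeof(stall_events) / sizeof(stall_events[0]); i++)
		if (strcmp(stall_events[i].generic, generic) == 0)
			return stall_events[i].power7;
	return NULL;
}
```

The NULL return corresponds to the "implement selectively" point: an architecture would simply leave out the entries it has no counter for.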

Regards
Anshuman

Re: [PATCH RFC 1/6 v3] gpio: Add a block GPIO API to gpiolib

2012-10-14 Thread Ryan Mallon

On 13/10/12 06:11, Roland Stigge wrote:
> The recurring task of providing simultaneous access to GPIO lines (especially
> for bit banging protocols) needs an appropriate API.
> 
> This patch adds a kernel internal "Block GPIO" API that enables simultaneous
> access to several GPIOs. This is done by abstracting GPIOs to an n-bit word:
> Once requested, it provides access to a group of GPIOs which can range over
> multiple GPIO chips.
> 
> Signed-off-by: Roland Stigge 

Hi Roland,

Some comments below.

~Ryan

> ---
> 
>  Documentation/gpio.txt |   45 +
>  drivers/gpio/gpiolib.c |  223 
> +
>  include/asm-generic/gpio.h |   14 ++
>  include/linux/gpio.h   |   61 
>  4 files changed, 343 insertions(+)
> 
> --- linux-2.6.orig/Documentation/gpio.txt
> +++ linux-2.6/Documentation/gpio.txt
> @@ -439,6 +439,51 @@ slower clock delays the rising edge of S
>  signaling rate accordingly.
>  
>  
> +Block GPIO
> +--
> +
> +The above described interface concentrates on handling single GPIOs.  However,
> +in applications where it is critical to set several GPIOs at once, this
> +interface doesn't work well, e.g. bit-banging protocols via grouped GPIO lines.
> +Consider a GPIO controller that is connected via a slow I2C line. When
> +switching two or more GPIOs one after another, there can be considerable time
> +between those events. This is solved by an interface called Block GPIO:

The emulated behaviour of a gpio block switches the gpios one after the other.
Is the problem only solved if the block_get/block_set callbacks can be
implemented?
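
To illustrate the concern, a minimal userspace sketch of what such an emulated fallback amounts to is shown below. It is not the gpiolib code from the patch: `fake_gpio_set_value` stands in for gpio_set_value(), and the assumption that bit i of the value maps to the i-th GPIO in the list is taken from the documentation text quoted further down.

```c
#include <stddef.h>

#define NR_FAKE_GPIOS 64

/* Stand-in for real hardware: remembers each line's level and counts
 * how many separate accesses were needed. */
static int fake_level[NR_FAKE_GPIOS];
static unsigned fake_writes;

static void fake_gpio_set_value(unsigned gpio, int value)
{
	fake_level[gpio] = value;
	fake_writes++;
}

/* Emulated block set: bit i of 'value' goes to gpios[i], one line at a
 * time. Every iteration is a separate access, so the lines do not
 * change simultaneously. */
static void block_set_emulated(const unsigned *gpios, size_t size,
			       unsigned value)
{
	size_t i;

	for (i = 0; i < size; i++)
		fake_gpio_set_value(gpios[i], (value >> i) & 1u);
}
```

With a controller behind a slow bus, each of those accesses would be a separate transfer, which is the delay the Block GPIO text says it wants to avoid.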

> +struct gpio_block *gpio_block_create(unsigned int *gpios, size_t size);
> +
> +This creates a new block of GPIOs as a list of GPIO numbers with the specified
> +size which are accessible via the returned struct gpio_block and the accessor
> +functions described below. Please note that you need to request the GPIOs
> +separately via gpio_request(). An arbitrary list of globally valid GPIOs can be
> +specified, even ranging over several gpio_chips. Actual handling of I/O
> +operations will be done on a best effort base, i.e. simultaneous I/O only where
> +possible by hardware and implemented in the respective GPIO driver. The number
> +of GPIOs in one block is limited to 32 on a 32 bit system, and 64 on a 64 bit
> +system. However, several blocks can be defined at once.
> +
> +unsigned gpio_block_get(struct gpio_block *block);
> +void gpio_block_set(struct gpio_block *block, unsigned value);
> +
> +With those accessor functions, setting and getting the GPIO values is possible,
> +analogous to gpio_get_value() and gpio_set_value(). Each bit in the return
> +value of gpio_block_get() and in the value argument of gpio_block_set()
> +corresponds to a bit specified on gpio_block_create(). Block operations in
> +hardware are only possible where the respective GPIO driver implements it,
> +falling back to using single GPIO operations where the driver only implements
> +single GPIO access.
> +
> +void gpio_block_free(struct gpio_block *block);
> +
> +After the GPIO block isn't used anymore, it should be free'd via
> +gpio_block_free().
> +
> +int gpio_block_register(struct gpio_block *block);
> +void gpio_block_unregister(struct gpio_block *block);
> +
> +These functions can be used to register a GPIO block. Blocks registered this
> +way will be available via sysfs.
> +
> +
>  What do these conventions omit?
>  ===
>  One of the biggest things these conventions omit is pin multiplexing, since
> --- linux-2.6.orig/drivers/gpio/gpiolib.c
> +++ linux-2.6/drivers/gpio/gpiolib.c
> @@ -83,6 +83,10 @@ static inline void desc_set_label(struct
>  #endif
>  }
>  
> +#define NR_GPIO_BLOCKS   16
> +
> +static struct gpio_block *gpio_block[NR_GPIO_BLOCKS];
> +
>  /* Warn when drivers omit gpio_request() calls -- legal but ill-advised
>   * when setting direction, and otherwise illegal.  Until board setup code
>   * and drivers use explicit requests everywhere (which won't happen when
> @@ -1676,6 +1680,225 @@ void __gpio_set_value(unsigned gpio, int
>  }
>  EXPORT_SYMBOL_GPL(__gpio_set_value);
>  
> +static inline

Nitpick - don't need the inline, the compiler will do so for you.

> +int gpio_block_chip_index(struct gpio_block *block, struct gpio_chip *gc)

Should be static?

> +{
> + int i;
> +
> + for (i = 0; i < block->nchip; i++) {
> + if (block->gbc[i].gc == gc)
> + return i;
> + }
> + return -1;
> +}
> +
> +struct gpio_block *gpio_block_create(unsigned *gpios, size_t size,
> +  const char *name)
> +{
> + struct gpio_block *block;
> + struct gpio_block_chip *gbc;
> + struct gpio_remap *remap;
> + int i;
> +
> + if (size < 1 || size > sizeof(unsigned) * 8)
> + return ERR_PTR(-EINVAL);
> +
> + block = kzalloc(sizeof(struct gpio_block), GFP_KERNEL);
> +

[RFC PATCH 3/3] USB: forbid memory allocation with I/O during bus reset if storage interface exits

2012-10-14 Thread Ming Lei

If a storage interface exists in the device, memory allocation
with GFP_KERNEL during usb_reset_device() might trigger an I/O transfer
on the storage interface itself and cause a deadlock: 'us->dev_mutex'
is held in .pre_reset(), so the storage interface can't do I/O
transfers when the reset is triggered by another interface, and the
error handling can't complete if the reset is triggered by mass
storage itself (error handling path).

Cc: Alan Stern 
Cc: Oliver Neukum 
Signed-off-by: Ming Lei 
---
 drivers/usb/core/hub.c|   42 +-
 drivers/usb/storage/uas.c |4 
 drivers/usb/storage/usb.c |4 
 include/linux/usb.h   |1 +
 4 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 9dc8ff2..f6958f7 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -5004,6 +5004,33 @@ re_enumerate:
return -ENODEV;
 }
 
+static inline int is_bound_usb_storage_intf(struct usb_interface *intf)
+{
+   struct usb_driver *drv;
+
+   if (!intf->dev.driver)
+   return 0;
+
+   drv = to_usb_driver(intf->dev.driver);
+
+   return drv->for_storage;
+}
+
+static inline int has_bound_usb_storage_intf(struct usb_device *udev)
+{
+   struct usb_host_config *config = udev->actconfig;
+
+   if (config) {
+   int i;
+   for (i = 0; i < config->desc.bNumInterfaces; ++i) {
+   struct usb_interface *cintf = config->interface[i];
+   if (is_bound_usb_storage_intf(cintf))
+   return 1;
+   }
+   }
+   return 0;
+}
+
 /**
  * usb_reset_device - warn interface drivers and perform a USB port reset
  * @udev: device to reset (not in SUSPENDED or NOTATTACHED state)
@@ -5027,7 +5054,7 @@ re_enumerate:
 int usb_reset_device(struct usb_device *udev)
 {
int ret;
-   int i;
+   int i, no_io = 0;
struct usb_host_config *config = udev->actconfig;
 
if (udev->state == USB_STATE_NOTATTACHED ||
@@ -5037,6 +5064,15 @@ int usb_reset_device(struct usb_device *udev)
return -EINVAL;
}
 
+   /*
+* If bound mass storage interface exists, don't allocate memory
+* with GFP_KERNEL in current context to avoid possible deadlock
+*/
+   if (has_bound_usb_storage_intf(udev)) {
+   no_io = 1;
+   tsk_memalloc_forbid_io(current);
+   }
+
/* Prevent autosuspend during the reset */
usb_autoresume_device(udev);
 
@@ -5081,6 +5117,10 @@ int usb_reset_device(struct usb_device *udev)
}
 
usb_autosuspend_device(udev);
+
+   if (no_io)
+   tsk_memalloc_allow_io(current);
+
return ret;
 }
 EXPORT_SYMBOL_GPL(usb_reset_device);
diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index 98b98ee..89daecb 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -888,6 +888,10 @@ static int uas_probe(struct usb_interface *intf, const struct usb_device_id *id)
struct Scsi_Host *shost;
struct uas_dev_info *devinfo;
struct usb_device *udev = interface_to_usbdev(intf);
+   struct usb_driver *drv = to_usb_driver(intf->dev.driver);
+
+   if (!drv->for_storage)
+   drv->for_storage = 1;
 
if (uas_switch_interface(udev, intf))
return -ENODEV;
diff --git a/drivers/usb/storage/usb.c b/drivers/usb/storage/usb.c
index 12aa726..f11ed09 100644
--- a/drivers/usb/storage/usb.c
+++ b/drivers/usb/storage/usb.c
@@ -905,9 +905,13 @@ int usb_stor_probe1(struct us_data **pus,
struct Scsi_Host *host;
struct us_data *us;
int result;
+   struct usb_driver *drv = to_usb_driver(intf->dev.driver);
 
US_DEBUGP("USB Mass Storage device detected\n");
 
+   if (!drv->for_storage)
+   drv->for_storage = 1;
+
/*
 * Ask the SCSI layer to allocate a host structure, with extra
 * space at the end for our private us_data structure.
diff --git a/include/linux/usb.h b/include/linux/usb.h
index 07915a3..3216b00 100644
--- a/include/linux/usb.h
+++ b/include/linux/usb.h
@@ -1024,6 +1024,7 @@ struct usb_driver {
unsigned int supports_autosuspend:1;
unsigned int disable_hub_initiated_lpm:1;
unsigned int soft_unbind:1;
+   unsigned int for_storage:1;
 };
 #define to_usb_driver(d) container_of(d, struct usb_driver, drvwrap.driver)
 
-- 
1.7.9.5

[RFC PATCH 2/3] PM / Runtime: force memory allocation with no I/O during runtime_resume callbcack

2012-10-14 Thread Ming Lei

This patch applies the newly introduced tsk_memalloc_forbid_io() and
tsk_memalloc_allow_io() to force memory allocation with no I/O
during the runtime_resume callback.

Cc: Alan Stern 
Cc: Oliver Neukum 
Cc: Rafael J. Wysocki 
Signed-off-by: Ming Lei 
---
 drivers/base/power/runtime.c |   13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 3148b10..76836c1 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -652,7 +652,20 @@ static int rpm_resume(struct device *dev, int rpmflags)
if (!callback && dev->driver && dev->driver->pm)
callback = dev->driver->pm->runtime_resume;
 
+   /*
+* Deadlock might be caused if memory allocation with GFP_KERNEL
+* happens inside runtime_resume callback of one block device's
+* ancestor or the block device itself. The easiest approach is
+* to forbid I/O inside runtime_resume of all devices.
+*
+* In fact, it can be done only if the device is a block device
+* or there is one block device descendant. But that may become
+* complicated and not efficient because device tree traversing
+* is involved.
+*/
+   tsk_memalloc_forbid_io(current);
retval = rpm_callback(callback, dev);
+   tsk_memalloc_allow_io(current);
if (retval) {
__update_runtime_status(dev, RPM_SUSPENDED);
pm_runtime_cancel_pending(dev);
-- 
1.7.9.5

[RFC PATCH 1/3] mm: teach mm by current context info to not do I/O during memory allocation

2012-10-14 Thread Ming Lei

This patch introduces the PF_MEMALLOC_NOIO process flag (in the 'flags'
field of 'struct task_struct'), so that a task can set the flag
to avoid doing I/O inside memory allocation in that task's context.

The patch tries to solve a deadlock problem caused by block devices,
which can occur in at least the following situations:

- during block device runtime resume, if memory allocation
with GFP_KERNEL is called inside the runtime resume callback of any
of its ancestors (or the block device itself), a deadlock may be
triggered inside the memory allocation, since it might not complete
until the block device becomes active and the involved page I/O finishes.
This situation was first pointed out by Alan Stern. Converting all
GFP_KERNEL allocations in the path into GFP_NOIO is not a good approach
because several subsystems may be involved (for example, PCI, USB and
SCSI may be involved for a USB mass storage device).

- during error handling of a USB mass storage device, a USB
bus reset will be performed on the device, so there must not be any
memory allocation with GFP_KERNEL during the USB bus reset; otherwise
a deadlock similar to the above may be triggered. Unfortunately, any
USB device may in theory include a mass storage interface, so this
would require all USB interface drivers to handle the situation. In fact,
most USB drivers don't know how to handle a bus reset on the device
and don't provide .pre_reset() and .post_reset() callbacks at all, so
the USB core has to unbind and rebind the driver for these devices. So it
is still not practical to resort to GFP_NOIO to solve the problem.

The introduced solution can also be used by the block subsystem or block
drivers, for example by setting the PF_MEMALLOC_NOIO flag before doing the
actual I/O transfer.
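
The effect of the flag on an allocation can be sketched in userspace as follows. The bit values and the `SK_`-prefixed names are hypothetical stand-ins, not the kernel's definitions; the only thing taken from the patch is the logic itself: when the task flag is set, the I/O and filesystem bits are stripped from the caller's mask before the allocator acts on it.

```c
/* Hypothetical bit values, chosen only for this illustration. */
#define SK_GFP_WAIT   0x10u
#define SK_GFP_IO     0x40u
#define SK_GFP_FS     0x80u
#define SK_GFP_IOFS   (SK_GFP_IO | SK_GFP_FS)
#define SK_GFP_KERNEL (SK_GFP_WAIT | SK_GFP_IO | SK_GFP_FS)

#define SK_PF_MEMALLOC_NOIO 0x1u

/* Stand-in for the part of struct task_struct that matters here. */
struct task_sketch {
	unsigned flags;
};

static void sketch_forbid_io(struct task_sketch *t)
{
	t->flags |= SK_PF_MEMALLOC_NOIO;
}

static void sketch_allow_io(struct task_sketch *t)
{
	t->flags &= ~SK_PF_MEMALLOC_NOIO;
}

/* Mirrors the check added to the page allocator entry point: a task
 * that has forbidden I/O gets its mask reduced, whatever it asked for. */
static unsigned sketch_effective_gfp(const struct task_sketch *t,
				     unsigned gfp_mask)
{
	if (t->flags & SK_PF_MEMALLOC_NOIO)
		gfp_mask &= ~SK_GFP_IOFS;
	return gfp_mask;
}
```

One thing the sketch makes visible is that the allow helper clears the flag unconditionally, so nested forbid/allow pairs in the same task would re-enable I/O at the inner allow; a variant that saves and restores the previous state would avoid that.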

Cc: Alan Stern 
Cc: Oliver Neukum 
Cc: Jiri Kosina 
Cc: Andrew Morton 
Cc: Mel Gorman 
Cc: Minchan Kim 
Cc: KAMEZAWA Hiroyuki 
Cc: Michal Hocko 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: "Rafael J. Wysocki" 
Cc: linux-mm 

Signed-off-by: Ming Lei 
---
 include/linux/sched.h |5 +
 mm/page_alloc.c   |2 ++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index f6961c9..33be290 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1811,6 +1811,7 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define PF_FROZEN  0x0001  /* frozen for system suspend */
 #define PF_FSTRANS 0x0002  /* inside a filesystem transaction */
 #define PF_KSWAPD  0x0004  /* I am kswapd */
+#define PF_MEMALLOC_NOIO 0x0008/* Allocating memory without IO involved */
 #define PF_LESS_THROTTLE 0x0010/* Throttle me less: I clean memory */
 #define PF_KTHREAD 0x0020  /* I am a kernel thread */
 #define PF_RANDOMIZE   0x0040  /* randomize virtual address space */
@@ -1848,6 +1849,10 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define tsk_used_math(p) ((p)->flags & PF_USED_MATH)
 #define used_math() tsk_used_math(current)
 
+#define tsk_memalloc_no_io(p) ((p)->flags & PF_MEMALLOC_NOIO)
+#define tsk_memalloc_allow_io(p) do { (p)->flags &= ~PF_MEMALLOC_NOIO; } while (0)
+#define tsk_memalloc_forbid_io(p) do { (p)->flags |= PF_MEMALLOC_NOIO; } while (0)
+
 /*
  * task->jobctl flags
  */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8e1be1c..e15381f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2596,6 +2596,8 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
int alloc_flags = ALLOC_WMARK_LOW|ALLOC_CPUSET;
 
gfp_mask &= gfp_allowed_mask;
+   if (unlikely(tsk_memalloc_no_io(current)))
+   gfp_mask &= ~GFP_IOFS;
 
lockdep_trace_alloc(gfp_mask);
 
-- 
1.7.9.5

[RFC PATCH 0/3] mm/PM/USB: force memory allocation with no io in need

2012-10-14 Thread Ming Lei

Hi,

This patch set introduces one process flag and tries to fix a deadlock
problem on block devices during runtime resume or USB bus reset.

The 1st patch changes include/linux/sched.h and mm/page_alloc.c.

The other 2 patches are applied against the PM and USB subsystems to
demonstrate how to use the introduced mechanism to fix the deadlock problem.

 drivers/base/power/runtime.c |   13 +
 drivers/usb/core/hub.c   |   42 +-
 drivers/usb/storage/uas.c|4 
 drivers/usb/storage/usb.c|4 
 include/linux/sched.h|5 +
 include/linux/usb.h  |1 +
 mm/page_alloc.c  |2 ++


Thanks,
--
Ming Lei

Re: [PATCH RFC 2/2] [x86] Optimize copy_page by re-arranging instruction sequence and saving register

2012-10-14 Thread George Spelvin

Just for everyone's information, here's the updated benchmark code on
the same Phenom.  The REP MOVSQ code is indeed much faster.

vendor_id   : AuthenticAMD
cpu family  : 16
model   : 2
model name  : AMD Phenom(tm) 9850 Quad-Core Processor
stepping: 3
microcode   : 0x183
cpu MHz : 2500.210
cache size  : 512 KB
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb 
rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni 
monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a 
misalignsse 3dnowprefetch osvw ibs hw_pstate npt lbrv svm_lock
bogomips: 5000.42
TLB size: 1024 4K pages
clflush size: 64
cache_alignment : 64

copy_page_org   copy_page_new   REP MOVSQ
TPT: Len 4096, alignment  0/ 0: 708 758 611
TPT: Len 4096, alignment  0/ 0: 672 758 611
TPT: Len 4096, alignment  0/ 0: 708 758 611
TPT: Len 4096, alignment  0/ 0: 708 758 611
TPT: Len 4096, alignment  0/ 0: 708 758 611
copy_page_org   copy_page_new   REP MOVSQ
TPT: Len 4096, alignment  0/ 0: 694 759 611
TPT: Len 4096, alignment  0/ 0: 672 758 611
TPT: Len 4096, alignment  0/ 0: 708 758 611
TPT: Len 4096, alignment  0/ 0: 708 759 611
TPT: Len 4096, alignment  0/ 0: 708 757 611
copy_page_org   copy_page_new   REP MOVSQ
TPT: Len 4096, alignment  0/ 0: 697 758 611
TPT: Len 4096, alignment  0/ 0: 708 758 611
TPT: Len 4096, alignment  0/ 0: 708 758 611
TPT: Len 4096, alignment  0/ 0: 708 758 611
TPT: Len 4096, alignment  0/ 0: 708 757 611
copy_page_org   copy_page_new   REP MOVSQ
TPT: Len 4096, alignment  0/ 0: 703 758 612
TPT: Len 4096, alignment  0/ 0: 709 758 611
TPT: Len 4096, alignment  0/ 0: 709 757 611
TPT: Len 4096, alignment  0/ 0: 709 759 613
TPT: Len 4096, alignment  0/ 0: 709 759 611
copy_page_org   copy_page_new   REP MOVSQ
TPT: Len 4096, alignment  0/ 0: 669 758 613
TPT: Len 4096, alignment  0/ 0: 671 758 611
TPT: Len 4096, alignment  0/ 0: 708 758 611
TPT: Len 4096, alignment  0/ 0: 708 758 611
TPT: Len 4096, alignment  0/ 0: 708 758 613
copy_page_org   copy_page_new   REP MOVSQ
TPT: Len 4096, alignment  0/ 0: 679 758 612
TPT: Len 4096, alignment  0/ 0: 671 758 612
TPT: Len 4096, alignment  0/ 0: 684 759 612
TPT: Len 4096, alignment  0/ 0: 709 759 613
TPT: Len 4096, alignment  0/ 0: 709 759 611
copy_page_org   copy_page_new   REP MOVSQ
TPT: Len 4096, alignment  0/ 0: 682 758 612
TPT: Len 4096, alignment  0/ 0: 673 758 613
TPT: Len 4096, alignment  0/ 0: 704 759 613
TPT: Len 4096, alignment  0/ 0: 709 758 613
TPT: Len 4096, alignment  0/ 0: 709 758 611
copy_page_org   copy_page_new   REP MOVSQ
TPT: Len 4096, alignment  0/ 0: 669 759 611
TPT: Len 4096, alignment  0/ 0: 671 759 611
TPT: Len 4096, alignment  0/ 0: 709 759 613
TPT: Len 4096, alignment  0/ 0: 709 759 613
TPT: Len 4096, alignment  0/ 0: 708 759 613
copy_page_org   copy_page_new   REP MOVSQ
TPT: Len 4096, alignment  0/ 0: 668 759 612
TPT: Len 4096, alignment  0/ 0: 709 759 612
TPT: Len 4096, alignment  0/ 0: 709 759 612
TPT: Len 4096, alignment  0/ 0: 709 759 612
TPT: Len 4096, alignment  0/ 0: 709 759 6

RE: [PATCH RFC 2/2] [x86] Optimize copy_page by re-arranging instruction sequence and saving register

2012-10-14 Thread Ma, Ling

Thanks Boris!
So the patch is helpful and has no impact on other/older machines.
I will re-send a new version addressing the comments.
Any further comments are appreciated!

Regards
Ling

> -----Original Message-----
> From: Borislav Petkov [mailto:b...@alien8.de]
> Sent: Sunday, October 14, 2012 6:58 PM
> To: Ma, Ling
> Cc: Konrad Rzeszutek Wilk; mi...@elte.hu; h...@zytor.com;
> t...@linutronix.de; linux-kernel@vger.kernel.org; i...@google.com;
> George Spelvin
> Subject: Re: [PATCH RFC 2/2] [x86] Optimize copy_page by re-arranging
> instruction sequence and saving register
> 
> On Fri, Oct 12, 2012 at 08:04:11PM +0200, Borislav Petkov wrote:
> > Right, so benchmark shows around 20% speedup on Bulldozer but this is
> > a microbenchmark and before pursue this further, we need to verify
> > whether this brings any palpable speedup with a real benchmark, I
> > don't know, kernbench, netbench, whatever. Even something as boring as
> > kernel build. And probably check for perf regressions on the rest of
> > the uarches.
> 
> Ok, so to summarize, on AMD we're using REP MOVSQ which is even faster
> than the unrolled version. I've added the REP MOVSQ version to the
> µbenchmark. It nicely validates that we're correctly setting
> X86_FEATURE_REP_GOOD on everything >= F10h and some K8s.
> 
> So, to answer Konrad's question: those patches don't concern AMD
> machines.
> 
> Thanks.
> 
> --
> Regards/Gruss,
> Boris.

Re: [GIT PULL] Disintegrate UAPI for sh [ver #2]

2012-10-14 Thread Paul Mundt

On Tue, Oct 09, 2012 at 10:15:57AM +0100, David Howells wrote:
> Can you merge the following branch into the sh tree please.
> 
> This is to complete part of the UAPI disintegration for which the preparatory
> patches were pulled recently.
> 
> Now that the fixups and the asm-generic chunk have been merged, I've
> regenerated the patches to get rid of those dependencies and to take account of
> any changes made so far in the merge window.  If you have already pulled the
> older version of the branch aimed at you, then please feel free to ignore this
> request.
> 
> The following changes since commit 9e2d8656f5e8aa214e66b462680cf86b210b74a8:
> 
>   Merge branch 'akpm' (Andrew's patch-bomb) (2012-10-09 16:23:15 +0900)
> 
> are available in the git repository at:
> 
> 
>   git://git.infradead.org/users/dhowells/linux-headers.git tags/disintegrate-sh-20121009
> 
Pulled, thanks.

Re: [Bug fix] nfs-client: fix nfs_inode_attrs_need_update for async read_done comes during truncating to smaller size

2012-10-14 Thread Chen Gang

On 2012-10-15 12:27, Myklebust, Trond wrote:
> nfs_size_need_update is not about performance. It is a heuristic that is
> entirely about ensuring correctness when faced with the fact that most
> Linux filesystems are utterly incapable of reporting with modifications
> that occur within < 1 second intervals because their mtime/ctime is
> limited to 1 second resolutions.
> 

If it were truly for correctness, why not use "!=" instead of '>'?

> Now, what are the conditions of your test setup? The above bug report is
> meaningless unless it includes a description of what is being exported
> by the server (including a proper listing of the contents
> of /etc/exports and /proc/mounts). It should also include a description
> of the NFS client mount options (see /proc/mounts on the client).

They are below. If you need additional information, please let me know.

for server:
(the nfsx-linux test runs exportfs over rsh on the command line, so the export is not listed in /etc/exports)

root@dhcp122:~# exportfs
/tmp/fsx18251.testdir

/tmp
root@dhcp122:~#
root@dhcp122:~# cat /etc/exports
# /etc/exports: the access control list for filesystems which may be exported
#   to NFS clients.  See exports(5).
#
# Example for NFSv2 and NFSv3:
# /srv/homes   hostname1(rw,sync,no_subtree_check) hostname2(ro,sync,no_subtree_check)
#
# Example for NFSv4:
# /srv/nfs4        gss/krb5i(rw,sync,fsid=0,crossmnt,no_subtree_check)
# /srv/nfs4/homes  gss/krb5i(rw,sync,no_subtree_check)
#
/tmp *(rw,sync,no_root_squash,no_subtree_check)
root@dhcp122:~#
root@dhcp122:~# cat /proc/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=1229628k,nr_inodes=189901,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,relatime,size=516280k,mode=755 0 0
/dev/disk/by-uuid/e843c57e-98ce-44cc-8e02-6d8e8d8a01b6 / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
cgroup /sys/fs/cgroup tmpfs rw,relatime,mode=755 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/cpu cgroup rw,relatime,cpu 0 0
cgroup /sys/fs/cgroup/cpuacct cgroup rw,relatime,cpuacct 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,relatime,freezer 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,relatime,blkio 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,relatime,perf_event 0 0
none /sys/fs/fuse/connections fusectl rw,relatime 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0
none /sys/kernel/security securityfs rw,relatime 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
rpc_pipefs /run/rpc_pipefs rpc_pipefs rw,relatime 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
root@dhcp122:~#
---

for client:
---

root@dhcp159:/opt/ltp/testscripts# cat /proc/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=1103700k,nr_inodes=190392,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,relatime,size=465908k,mode=755 0 0
/dev/disk/by-uuid/418ec1f1-ed9d-4cae-9336-6c742accf538 / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
cgroup /sys/fs/cgroup tmpfs rw,relatime,mode=755 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/cpu cgroup rw,relatime,cpu 0 0
cgroup /sys/fs/cgroup/cpuacct cgroup rw,relatime,cpuacct 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,relatime,freezer 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,relatime,blkio 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,relatime,perf_event 0 0
none /sys/fs/fuse/connections fusectl rw,relatime 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0
none /sys/kernel/security securityfs rw,relatime 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
rpc_pipefs /run/rpc_pipefs rpc_pipefs rw,relatime 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
/dev/sda1 /mnt/sda1 ext3 rw,relatime,errors=continue,user_xattr,acl,barrier=1,data=ordered 0 0
dhcp122.asianux.net:/tmp/fsx18251.testdir/ /opt/ltp/testcases/bin/fsx18251 nfs rw,relatime,vers=2,rsize=8192,wsize=8192,namlen=255,hard,proto=udp,timeo=11,retrans=3,sec=sys,mountaddr=10.1.0.139,mountvers=1,mountport=39973,mountproto=udp,local_lock=none,addr=10.1.0.139 0 0
root@dhcp159:/opt/ltp/testscripts#


-- 
Chen Gang

Asianux Corporation

Re: [PATCH V2 0/3] Create sched_select_cpu() and use it in workqueues

2012-10-14 Thread Viresh Kumar

On 27 September 2012 14:34, Viresh Kumar  wrote:
> This is V2 of my sched_select_cpu() work.
>
> In order to save power, it would be useful to schedule work onto non-IDLE cpus
> instead of waking up an IDLE one.
>
> To achieve this, we need scheduler to guide kernel frameworks (like: timers &
> workqueues) on which is the most preferred CPU that must be used for these
> tasks.
>
> This patchset is about implementing this concept.
>
> - The first patch adds sched_select_cpu() routine which returns the preferred
>   cpu which is non-idle.
> - Second patch removes idle_cpu() calls from timer & hrtimer.
> - Third patch is about adapting this change in workqueue framework.
>
> Earlier discussions over v1 can be found here:
> http://www.mail-archive.com/linaro-dev@lists.linaro.org/msg13342.html
>
> Earlier discussions over this concept were done at last LPC:
> http://summit.linuxplumbersconf.org/lpc-2012/meeting/90/lpc2012-sched-timer-workqueue/
>
> Module created for testing this behavior is present here:
> http://git.linaro.org/gitweb?p=people/vireshk/module.git;a=summary
>
> Following are the steps followed in test module:
> 1. Run single work on each cpu
> 2. This work will start a timer after x (tested with 10) jiffies of delay
> 3. Timer routine queues a work... (This may be called from idle or non-idle cpu)
>    and starts the same timer again. STEP 3 is done for n number of times (i.e.
>    queuing n works, one after other)
> 4. All works will call a single routine, which will count following per cpu:
>  - Total works processed by a CPU
>  - Total works processed by a CPU, which are queued from it
>  - Total works processed by a CPU, which aren't queued from it
>
> Setup:
> -
> - ARM Vexpress TC2 - big.LITTLE CPU
> - Core 0-1: A15, 2-4: A7
> - rootfs: linaro-ubuntu-nano
>
> Results:
> ---
> Without Workqueue Modification, i.e. PATCH 3/3:
> [ 2493.022335] Workqueue Analyser: works processsed by CPU0, Total: 1000, Own: 0, migrated: 0
> [ 2493.047789] Workqueue Analyser: works processsed by CPU1, Total: 1000, Own: 0, migrated: 0
> [ 2493.072918] Workqueue Analyser: works processsed by CPU2, Total: 1000, Own: 0, migrated: 0
> [ 2493.098576] Workqueue Analyser: works processsed by CPU3, Total: 1000, Own: 0, migrated: 0
> [ 2493.123702] Workqueue Analyser: works processsed by CPU4, Total: 1000, Own: 0, migrated: 0
>
> With Workqueue Modification, i.e. PATCH 3/3:
> [ 2493.022335] Workqueue Analyser: works processsed by CPU0, Total: 1002, Own: 999, migrated: 3
> [ 2493.047789] Workqueue Analyser: works processsed by CPU1, Total: 998, Own: 997, migrated: 1
> [ 2493.072918] Workqueue Analyser: works processsed by CPU2, Total: 1013, Own: 996, migrated: 17
> [ 2493.098576] Workqueue Analyser: works processsed by CPU3, Total: 998, Own: 993, migrated: 5
> [ 2493.123702] Workqueue Analyser: works processsed by CPU4, Total: 989, Own: 987, migrated: 2
>
> V1->V2
> -
> - New SD_* macros removed now and earlier ones used
> - sched_select_cpu() rewritten and it includes the check on current cpu's
>   idleness.
> - cpu_idle() calls from timer and hrtimer removed now.
> - Patch 2/3 from V1, removed as it doesn't apply to latest workqueue branch
>   from tejun.
> - CONFIG_MIGRATE_WQ removed and so is wq_select_cpu()
> - sched_select_cpu() called only from __queue_work()
> - got tejun/for-3.7 branch in my tree, before making workqueue changes.
>
> Viresh Kumar (3):
>   sched: Create sched_select_cpu() to give preferred CPU for power
> saving
>   timer: hrtimer: Don't check idle_cpu() before calling
> get_nohz_timer_target()
>   workqueue: Schedule work on non-idle cpu instead of current one
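For readers who have not seen the patches, the idea from the cover letter can be sketched in user space as below. This is only a model under stated assumptions: the function name, the cpu_idle[] snapshot and the linear scan are made up for illustration and are not the code from the patchset, which would have to consult the scheduler's own data.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: prefer the current CPU if it is busy, otherwise
 * fall back to any other busy CPU, and only then to the current (idle) one.
 */
static int select_non_idle_cpu(const bool *cpu_idle, int nr_cpus, int this_cpu)
{
	if (!cpu_idle[this_cpu])
		return this_cpu;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (!cpu_idle[cpu])
			return cpu;	/* avoids waking up an idle CPU */
	}

	return this_cpu;	/* everything is idle: stay where we are */
}
```

With CPUs 0, 1, 3 and 4 idle and CPU 2 busy, a caller on CPU 0 would be steered to CPU 2 in this model, while a caller already on CPU 2 stays put.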

Hi Guys,

I totally understand that you were very busy over the last few weeks while the
merge window was open, so I didn't try to disturb you then :)

Can you please share your viewpoint on this patchset now? And also
the running timer migration patch (which was sent separately)?

Thanks in Advance.

--
viresh

Re: [Bug fix] nfs-client: fix nfs_inode_attrs_need_update for async read_done comes during truncating to smaller size

2012-10-14 Thread Myklebust, Trond

On Mon, 2012-10-15 at 10:12 +0800, Chen Gang wrote:
> Hello Trond Myklebust, Jeff Layton:
> 
> 1) Root Cause:
>A) begin truncate to smaller, after async read finish starting.
>B) async read done come, after truncate operation change inode size.
>C) in nfs_inode_attrs_need_update, nfs_size_need_update return true.
>   i)   the bigger size is the original old size of client itself.
>   ii)  the smaller size is the current true size.
>   iii) nfs_inode_attrs_need_update not consider this situation.
> 
> 2) Fix nfs_size_need_update:
>A) delete it:
>   i)   it is for performance, not necessary (not for correctness).
>   ii)  if it was necessary, it should use "!=" instead of '>'.
>   iii) it is the simplest way to fix this bug (maybe not best way).
>B) consider this situation in it:
>   i)   it is the best way.
>   ii)  it is a little complex (need think of)
>   iii) sorry for I do not know how to fix it (at least now).
>C) not touch it:
>   i)   correct another place (such as nfs_update_inode)
>   ii)  it is a bad idea (at least, I think it is)
>   iii) we need keep the source code as clearer as possible.
> 
> 3) Test Result:
>A) it is one client and one server separately, under 3.6-rc5 x86_32.
>B) use one process (fsx-linux) test (only one user mode thread).
>C) only use read, truncate, llseek, fstat operation for one file.
> 
>Before delete nfs_size_need_update, it causes issue.
>After delete nfs_size_need_update, it is ok.

nfs_size_need_update is not about performance. It is a heuristic that is
entirely about ensuring correctness when faced with the fact that most
Linux filesystems are utterly incapable of reporting with modifications
that occur within < 1 second intervals because their mtime/ctime is
limited to 1 second resolutions.

Now, what are the conditions of your test setup? The above bug report is
meaningless unless it includes a description of what is being exported
by the server (including a proper listing of the contents
of /etc/exports and /proc/mounts). It should also include a description
of the NFS client mount options (see /proc/mounts on the client).

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
trond.mykleb...@netapp.com
www.netapp.com

Re: Linux 3.7-rc1 (uml uapi errors)

2012-10-14 Thread Alex Shi

>
> Building um (uml) for x86_64 (defconfig) has lots of errors like:
>
> In file included from include/linux/irq.h:22:0,
>  from include/asm-generic/hardirq.h:12,
>  from arch/um/include/generated/asm/hardirq.h:1,
>  from include/linux/hardirq.h:7,
>  from include/linux/ftrace_event.h:7,
>  from include/trace/syscall.h:6,
>  from include/linux/syscalls.h:78,
>  from init/noinitramfs.c:23:
> include/linux/irqnr.h:4:30: fatal error: uapi/linux/irqnr.h: No such file or directory
>

Bisection points to commit 244acb1ba3777c2eb4d33ddc246cab5419656442 as the
cause of this issue.

-- 
Thanks
Alex

Hardcoded instruction causes certain features to fail on ARM platform due to endianness

2012-10-14 Thread Yangfei (Felix)

Hi all,

I found that a hardcoded instruction in inline asm can cause certain
features to fail on the ARM platform due to endianness.
As an example, consider the following code snippet from the platform_do_lowpower
function in arch/arm/mach-realview/hotplug.c:
/*
 * here's the WFI
 */
asm(".word  0xe320f003\n"
:
:
: "memory", "cc");

The instruction generated from this inline asm will not work on a big-endian
ARM platform that uses the BE-8 format. Instead, an exception is raised.

The code should instead be:
/*
 * here's the WFI
 */
asm("WFI\n"
:
:
: "memory", "cc");

It seems the kernel doesn't support ARM BE-8 well. I don't know why this
problem exists.
Can anyone tell me who owns this part? I can then prepare a patch.
Thanks.
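To make the failure mode concrete, here is a small user-space model (not kernel code). It assumes the usual BE-8 arrangement, in which data words are stored big-endian while the CPU still fetches instructions as little-endian words; the helper names are invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit value as data, the way ".word" would on the given target. */
static void store_data_word(uint8_t mem[4], uint32_t value, int big_endian_data)
{
	for (int i = 0; i < 4; i++) {
		int shift = big_endian_data ? 24 - 8 * i : 8 * i;

		mem[i] = (uint8_t)(value >> shift);
	}
}

/* Assumption: instruction fetch always reads little-endian words (BE-8). */
static uint32_t fetch_instruction(const uint8_t mem[4])
{
	return (uint32_t)mem[0] | (uint32_t)mem[1] << 8 |
	       (uint32_t)mem[2] << 16 | (uint32_t)mem[3] << 24;
}

/* What the CPU decodes when 0xe320f003 (WFI) is emitted via ".word". */
static uint32_t decoded_opcode(int big_endian_data)
{
	uint8_t mem[4];

	store_data_word(mem, 0xe320f003u, big_endian_data);
	return fetch_instruction(mem);
}
```

In this model the value round-trips on a little-endian build, while with big-endian data the fetched word comes out byte-swapped, which would be consistent with the exception reported above. Writing the mnemonic leaves the byte placement to the toolchain, provided the assembler accepts WFI for the selected architecture level.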

[PATCH 2/2] perf tools: Remove warnings on JIT samples for srcline sort key

2012-10-14 Thread Namhyung Kim

From: Namhyung Kim 

When using the srcline sort key with perf report, I see many lines of
warning related to JIT samples like below:

  addr2line: '/tmp/perf-1397.map': No such file

Since it's not an ELF binary and doesn't provide such information, just
use the raw ip address.

Cc: David Ahern 
Cc: Irina Tirdea 
Signed-off-by: Namhyung Kim 
---
You may want to fold this into patch #1; I'm okay with that. :)

 tools/perf/util/sort.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index dd68f115d392..cfd1c0feb32d 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -263,6 +263,9 @@ static int hist_entry__srcline_snprintf(struct hist_entry *self, char *bf,
if (!self->ms.map)
goto out_ip;
 
+   if (!strncmp(self->ms.map->dso->long_name, "/tmp/perf-", 10))
+   goto out_ip;
+
snprintf(cmd, sizeof(cmd), "addr2line -e %s %016" PRIx64,
 self->ms.map->dso->long_name, self->ip);
fp = popen(cmd, "r");
-- 
1.7.11.4
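
Taken together with patch 1/2, the two guards can be read as a single predicate. The sketch below is a stand-alone model of that decision, assuming trimmed-down stand-in structures (dso_model, map_model) and a helper name that do not exist in perf itself:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-ins for perf's dso and map structures. */
struct dso_model { const char *long_name; };
struct map_model { struct dso_model *dso; };

/* Returns true when running addr2line on the sample's DSO makes sense. */
static bool can_use_addr2line(const struct map_model *map)
{
	if (!map)	/* no mapping: nothing to pass to "addr2line -e" */
		return false;

	/* JIT map files are not ELF binaries, so addr2line cannot read them */
	if (!strncmp(map->dso->long_name, "/tmp/perf-", 10))
		return false;

	return true;
}
```

When the predicate is false, the caller would fall back to printing the raw ip, which corresponds to the out_ip label in the patches.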


[PATCH 1/2] perf tools: Fix segfault when using srcline sort key

2012-10-14 Thread Namhyung Kim

From: Namhyung Kim 

The srcline sort key is for grouping samples based on their source
file and line number.  It uses the addr2line tool to get the information,
which requires a dso name.  When a sample has no dso name, a NULL
pointer was dereferenced and perf segfaulted.  Fix it by using raw ip
addresses for those samples.

Signed-off-by: Namhyung Kim 
---
 tools/perf/util/sort.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index b5b1b9211960..dd68f115d392 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -260,6 +260,9 @@ static int hist_entry__srcline_snprintf(struct hist_entry *self, char *bf,
if (path != NULL)
goto out_path;
 
+   if (!self->ms.map)
+   goto out_ip;
+
snprintf(cmd, sizeof(cmd), "addr2line -e %s %016" PRIx64,
 self->ms.map->dso->long_name, self->ip);
fp = popen(cmd, "r");
-- 
1.7.11.4


Re: [PATCH V2 2/3] dmaengine: dw_dmac: Enhance device tree support

2012-10-14 Thread Viresh Kumar

On 12 October 2012 20:28, Andy Shevchenko
 wrote:
>> + if (last_dw) {
>> + if ((last_bus_id == param) && (last_dw == dw))
>> + return false;
>> + }
> Just came to my mind.
> dw can't be NULL, can it?
> Then
> if (last_dw) {
> ...
> }
> is unneeded.

Fixup for this:

diff --git a/drivers/dma/dw_dmac.c b/drivers/dma/dw_dmac.c
index d72c26f..764c159 100644
--- a/drivers/dma/dw_dmac.c
+++ b/drivers/dma/dw_dmac.c
@@ -1196,11 +1196,8 @@ bool dw_dma_generic_filter(struct dma_chan *chan, void *param)
 * failure. If dw and param are same, i.e. trying on same dw with
 * different channel, return false.
 */
-   if (last_dw) {
-   if ((last_bus_id == param) && (last_dw == dw))
-   return false;
-   }
-
+   if ((last_dw == dw) && (last_bus_id == param))
+   return false;
/*
 * Return true:
 * - If dw_dma's platform data is not filled with slave info, then all
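
The reasoning behind the fixup can be checked with a small stand-alone model: once dw is known to be non-NULL, last_dw == dw already implies that last_dw is non-NULL, so the outer guard adds nothing. The function names below are made up for the sketch, and they return true when the filter would reject the channel (the driver itself returns false in that case).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Original form: explicit NULL guard around the comparison. */
static bool reject_guarded(const void *last_dw, const void *dw,
			   const void *last_bus_id, const void *param)
{
	if (last_dw) {
		if ((last_bus_id == param) && (last_dw == dw))
			return true;
	}
	return false;
}

/* Fixup form: the guard is implied as long as dw is never NULL. */
static bool reject_simple(const void *last_dw, const void *dw,
			  const void *last_bus_id, const void *param)
{
	return (last_dw == dw) && (last_bus_id == param);
}
```

The two forms only differ when dw itself is NULL, which is the case Andy's question rules out.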

Re: [PATCH V2 3/3] ARM: SPEAr13xx: Pass DW DMAC platform data from DT

2012-10-14 Thread Viresh Kumar

On 14 October 2012 15:31, Jean-Christophe PLAGNIOL-VILLARD
 wrote:
> On 19:40 Sat 13 Oct , Viresh Kumar wrote:
>> On 13 October 2012 19:38, Viresh Kumar  wrote:
>> > On 13 October 2012 17:52, Jean-Christophe PLAGNIOL-VILLARD
>> >  wrote:
>> >> On 14:18 Sat 13 Oct , Viresh Kumar wrote:
>> >>>On Oct 13, 2012 12:16 PM, "Jean-Christophe PLAGNIOL-VILLARD"
>> >>> wrote:
>> >>>>
>> >>>> On 22:42 Fri 12 Oct , Viresh Kumar wrote:
>> >>>> > On 12 October 2012 21:51, Jean-Christophe PLAGNIOL-VILLARD
>> >>>> >  wrote:
>> >>>> > >> >> + OF_DEV_AUXDATA("arasan,cf-spear1340", MCIF_CF_BASE, NULL, "cf"),
>> >>>> > >> > ?/
>> >>>> > >>
>> >>>> > >> Sorry. can't get it :(
>> >>>> > > what is the "cf" as platform data
>> >>>> >
>> >>>> > This is dma bus_id string, which matches with what is passed from dtb.
>> >>>> so pass it via dtb too
>> >>>
>> >>>Yes. Already passed in 13xx.dtsi.
>>
>> Probably some confusion here. What i meant to say here is, dmac's
>> DT slave info has a node for cf and cf driver expects this string to come
>> via platform data currently.
>
> so use a phandle to connect them

The purpose of this patchset wasn't to fix how CF driver sends its filter
routine and its parameter. CF driver would be fixed later by ST guys.

--
viresh

Re: [PATCH REGRESSION FIX] dw_dmac: make driver's endianness configurable

2012-10-14 Thread viresh kumar

On Sun, Oct 14, 2012 at 1:24 PM, Hein Tibosch  wrote:
> From: Hein Tibosch 
>
> The dw_dmac was originally developed for avr32 to be used with the Synopsys
> DesignWare AHB DMA controller. Starting from 2.6.38, access to the device's
> i/o memory was done with the little-endian readl/writel functions(1)
>
> This broke the driver for the avr32 platform, because it needs big (native)
> endian accessors.
> This patch makes the endianness configurable using 'DW_DMAC_BIG_ENDIAN_IO',
> which defaults to true for AVR32
>
> I submitted this patch before(2) but then waited for Andy to finish other
> changes to the same module(3).
>
> (1) https://patchwork.kernel.org/patch/608211
> (2) https://lkml.org/lkml/2012/8/26/148
> (3) https://lkml.org/lkml/2012/9/21/173
>
> Signed-off-by: Hein Tibosch 

Good.

Acked-by: Viresh Kumar 

Fix memory leak in cpufreq stats.

2012-10-14 Thread Tu, Xiaobing


Fix memory leak in cpufreq stats.

When the system enters sleep, the non-boot CPUs are disabled.
The cpufreq stats sysfs entry is created when a CPU comes up, but it is not
freed when the CPU goes down. This causes a memory leak.

Signed-off-by: xiaobing tu 
Signed-off-by: guifang tang 

diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
index b40ee14..3998316 100644
--- a/drivers/cpufreq/cpufreq_stats.c
+++ b/drivers/cpufreq/cpufreq_stats.c
@@ -328,6 +328,7 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
cpufreq_update_policy(cpu);
break;
case CPU_DOWN_PREPARE:
+   case CPU_DOWN_PREPARE_FROZEN:
cpufreq_stats_free_sysfs(cpu);
break;
case CPU_DEAD:

Br
XiaoBing Tu
PSI@System Integration Shanghai
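
As a rough illustration of why the extra case label matters, the user-space model below mimics the switch in the callback. The numeric values are assumptions that are meant to mirror the kernel's notifier constants (the _FROZEN variant being the base action with a "tasks frozen" bit set), and frees_stats() is a made-up helper, not a kernel function.

```c
#include <assert.h>

/* Illustrative values; assumed to mirror the kernel's CPU notifier constants. */
#define CPU_DOWN_PREPARE	0x0005
#define CPU_TASKS_FROZEN	0x0010
#define CPU_DOWN_PREPARE_FROZEN	(CPU_DOWN_PREPARE | CPU_TASKS_FROZEN)

/* Returns 1 if the stats would be freed for this notifier action. */
static int frees_stats(unsigned long action, int handle_frozen)
{
	switch (action) {
	case CPU_DOWN_PREPARE:
		return 1;
	case CPU_DOWN_PREPARE_FROZEN:
		return handle_frozen;	/* only handled with the patch applied */
	default:
		return 0;
	}
}
```

If the suspend path delivers the _FROZEN variant, as the description above implies, the unpatched switch never reaches the free call for it.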



linux-next: Tree for Oct 15

2012-10-14 Thread Stephen Rothwell

Hi all,

The merge window has closed, feel free to add new stuff again.

Changes since 20121012:

The l2-mtd tree still had its build failure so I used the version from
next-20121011.

The tip tree gained a conflict against Linus' tree.

The kvm-ppc tree gained a build failure so I used the version from
next-20121012.

The akpm tree gained a conflict against Linus' tree and lost a couple of
patches that turned up elsewhere.



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64. After the
final fixups (if any), it is also built with powerpc allnoconfig (32 and
64 bit), ppc44x_defconfig and allyesconfig (minus
CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc,
sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

We are up to 204 trees (counting Linus' and 26 trees of patches pending
for Linus' tree), more are welcome (even if they are currently empty).
Thanks to those who have contributed, and to those who haven't, please do.

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (ddffeb8 Linux 3.7-rc1)
Merging fixes/master (12250d8 Merge branch 'i2c-embedded/for-next' of git://git.pengutronix.de/git/wsa/linux)
Merging kbuild-current/rc-fixes (b1e0d8b kbuild: Fix gcc -x syntax)
Merging arm-current/fixes (3d6ee36 Merge branch 'late-for-linus' of git://git.linaro.org/people/rmk/linux-arm)
Merging m68k-current/for-linus (f82735d m68k: Use PTR_RET rather than if(IS_ERR(...)) + PTR_ERR)
Merging powerpc-merge/merge (fd3bc66 Merge tag 'disintegrate-powerpc-20121009' into merge)
Merging sparc/master (4d7127d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security)
Merging net/master (4d7127d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security)
Merging sound-current/for-linus (5d037f9 Merge tag 'asoc-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus)
Merging pci-current/for-linus (0ff9514 PCI: Don't print anything while decoding is disabled)
Merging wireless/master (c3e7724 mac80211: use ieee80211_free_txskb to fix possible skb leaks)
Merging driver-core.current/driver-core-linus (5698bd7 Linux 3.6-rc6)
Merging tty.current/tty-linus (b70936d tty: serial: sccnxp: Fix bug with unterminated platform_id list)
Merging usb.current/usb-linus (d40ce17 Merge tag 'disintegrate-usb-20121009' of git://git.infradead.org/users/dhowells/linux-headers into usb-linus)
Merging staging.current/staging-linus (5698bd7 Linux 3.6-rc6)
Merging char-misc.current/char-misc-linus (fea7a08 Linux 3.6-rc3)
Merging input-current/for-linus (0cc8d6a Merge branch 'next' into for-linus)
Merging md-current/for-linus (80b4812 md/raid10: fix "enough" function for detecting if array is failed.)
Merging audit-current/for-linus (c158a35 audit: no leading space in audit_log_d_path prefix)
Merging crypto-current/master (c9f97a2 crypto: x86/glue_helper - fix storing of new IV in CBC encryption)
Merging ide/master (9974e43 ide: fix generic_ide_suspend/resume Oops)
Merging dwmw2/master (244dc4e Merge git://git.infradead.org/users/dwmw2/random-2.6)
Merging sh-current/sh-fixes-for-linus (4403310 SH: Convert out[bwl] macros to inline functions)
Merging irqdomain-current/irqdomain/merge (15e06bf irqdomain: Fix debugfs formatting)
Merging devicetree-current/devicetree/merge (4e8383b of: release node fix for of_parse_phandle_with_args)
Merging spi-current/spi/merge (d1c185b of/spi: Fix SPI module loading by using proper "spi:" modalias prefixes.)
Merging gpio-current/gpio/merge (96b7064 gpio/tca6424: merge I2C transactions, remove cast)
Merging asm-generic/

[PATCH] perf ui/browser: Fix off-by-two bug on the first column

2012-10-14 Thread Namhyung Kim

From: Namhyung Kim 

Since commit 5395a04841fc ("perf hists: Separate overhead and baseline
columns"), the "Overhead" column is no longer always the first one.  This
resulted in a misaligned column in the normal (non-diff) output.

Reported-by: Markus Trippelsdorf 
Cc: Jiri Olsa 
Signed-off-by: Namhyung Kim 
---
 tools/perf/ui/browsers/hists.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 0568536ecf67..fe4430aed635 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -610,6 +610,7 @@ static int hist_browser__show_entry(struct hist_browser *browser,
char folded_sign = ' ';
bool current_entry = ui_browser__is_current_entry(&browser->b, row);
off_t row_offset = entry->row_offset;
+   bool first = true;
 
if (current_entry) {
browser->he_selection = entry;
@@ -633,10 +634,11 @@ static int hist_browser__show_entry(struct hist_browser *browser,
if (!perf_hpp__format[i].cond)
continue;
 
-   if (i) {
+   if (!first) {
slsmg_printf("  ");
width -= 2;
}
+   first = false;
 
if (perf_hpp__format[i].color) {
hpp.ptr = &percent;
-- 
1.7.11.4
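
The off-by-two can be reproduced with a small stand-alone model of the column loop. The enabled[] array plays the role of perf_hpp__format[i].cond, and separator_chars() is a made-up helper that only counts the characters spent on separators:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Count the separator characters printed before the enabled columns.
 * enabled[] plays the role of perf_hpp__format[i].cond.
 */
static int separator_chars(const bool *enabled, int n, bool use_first_flag)
{
	bool first = true;
	int used = 0;

	for (int i = 0; i < n; i++) {
		if (!enabled[i])
			continue;

		/* old test: index != 0; new test: not the first printed column */
		if (use_first_flag ? !first : i != 0)
			used += 2;	/* the two-space separator */
		first = false;
	}

	return used;
}
```

When the column at index 0 is disabled, the old test spends two characters in front of the first visible column, while the flag-based test does not. With every column enabled, both tests agree.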


[Bug fix] nfs-client: fix nfs_inode_attrs_need_update for async read_done comes during truncating to smaller size

2012-10-14 Thread Chen Gang

Hello Trond Myklebust, Jeff Layton:

1) Root Cause:
   A) A truncate to a smaller size begins after an async read has been started.
   B) The async read completes after the truncate has changed the inode size.
   C) In nfs_inode_attrs_need_update, nfs_size_need_update returns true.
      i)   The bigger size is the client's own original (old) size.
      ii)  The smaller size is the current, true size.
      iii) nfs_inode_attrs_need_update does not consider this situation.

2) Fix nfs_size_need_update:
   A) Delete it:
      i)   It is for performance and is not necessary (not for correctness).
      ii)  If it were necessary, it should use "!=" instead of '>'.
      iii) It is the simplest way to fix this bug (maybe not the best way).
   B) Handle this situation in it:
      i)   It is the best way.
      ii)  It is a little complex (needs more thought).
      iii) Sorry, I do not know how to fix it this way (at least for now).
   C) Do not touch it:
      i)   Correct another place instead (such as nfs_update_inode).
      ii)  It is a bad idea (at least, I think so).
      iii) We need to keep the source code as clear as possible.

3) Test Result:
   A) One client and one server on separate machines, both running 3.6-rc5 x86_32.
   B) One test process (fsx-linux), with only one user-mode thread.
   C) Only read, truncate, llseek and fstat operations are used, on one file.

   Before deleting nfs_size_need_update, the issue occurs.
   After deleting nfs_size_need_update, it is ok.
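
The sequence in 1) can be modelled in a few lines of user-space C. This is only a sketch of the grow-only comparison named above, not the kernel function; the helper names are made up, and the sizes are the ones that appear in the logs below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of the check under discussion: grow-only size updates. */
static bool size_need_update(uint64_t cur_isize, uint64_t new_isize)
{
	return new_isize > cur_isize;
}

/* Apply the size carried by a server reply to the cached inode size. */
static uint64_t apply_reply(uint64_t cur_isize, uint64_t reply_isize)
{
	return size_need_update(cur_isize, reply_isize) ? reply_isize : cur_isize;
}
```

In this model, a read reply that still carries the pre-truncate size 0x3bbca, applied after the local truncate to 0x36ef9, puts the cached size back to 0x3bbca.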


User Mode Log:
-
<<>>
tag=nfsx-linux stime=1350202875
cmdline="export VERSION SOCKET_TYPE; TCbin=$LTPROOT/testcases/bin fsx.sh"
contacts=""
analysis=exit
<<>>

Test Options:
 VERSION: 2
 RHOST: dhcp122.asianux.net
 ITERATIONS: 5
 SOCKET_TYPE: udp
 NFS_TYPE: nfs
Setting up remote machine: dhcp122.asianux.net
Mounting NFS filesystem dhcp122.asianux.net:/tmp/fsx1447.testdir on
/opt/ltp/testcases/bin/fsx1447 with options '-o proto=udp,vers=2 '
fsx-linux -N 5 /opt/ltp/testcases/bin/fsx1447/testfile Starting
truncating to largest ever: 0x13e76
truncating to largest ever: 0x2e52c
truncating to largest ever: 0x3c2c2
truncating to largest ever: 0x3f15f
truncating to largest ever: 0x3fcb9
truncating to largest ever: 0x3fe96
truncating to largest ever: 0x3ff9d
Size error: expected 0x36ef9 stat 0x3bbca seek 0x36ef9
LOG DUMP (5652 total operations):

...

5636: 1350203089.781599 READ 0x143b6 thru 0x21ccb (0xd916 bytes)
5637: 1350203090.028214 READ 0x2a629 thru 0x2d0a1 (0x2a79 bytes)
5638: 1350203090.072029 TRUNCATE DOWN   from 0x2d0a2 to 0x1bb35
5639: 1350203090.087401 READ 0x11a05 thru 0x1bb34 (0xa130 bytes)
5640: 1350203090.223985 READ 0x508c thru 0xa9da (0x594f bytes)
5641: 1350203090.245717 TRUNCATE DOWN   from 0x1bb35 to 0x8830
5642: 1350203090.353502 READ 0x548f thru 0x882f (0x33a1 bytes)
5643: 1350203090.366596 READ 0x5802 thru 0x882f (0x302e bytes)
5644: 1350203090.366629 TRUNCATE UP from 0x8830 to 0x20011
5645: 1350203090.379476 TRUNCATE DOWN   from 0x20011 to 0x134f4
5646: 1350203090.396234 READ 0x124a0 thru 0x134f3 (0x1054 bytes)
5647: 1350203090.401805 READ 0x880b thru 0x1189d (0x9093 bytes)
5648: 1350203090.532050 READ 0x134c7 thru 0x134f3 (0x2d bytes)
5649: 1350203090.532057 TRUNCATE UP from 0x134f4 to 0x3bbca
5650: 1350203090.546373 READ 0x2944c thru 0x2c1d6 (0x2d8b bytes)
5651: 1350203090.561228 READ 0xdbe1 thru 0x16260 (0x8680 bytes)
5652: 1350203090.751937 TRUNCATE DOWN   from 0x3bbca to 0x36ef9
Correct content saved for comparison
(maybe hexdump "/opt/ltp/testcases/bin/fsx1447/testfile" vs
"/opt/ltp/testcases/bin/fsx1447/testfile.fsxgood")
fsx-linux -N 5 /opt/ltp/testcases/bin/fsx1447/testfile Finished
Cleaning up testcase
Unmounting /opt/ltp/testcases/bin/fsx1447
Test Failed: Errors have resulted from this test
incrementing stop
<<>>
initiation_status="ok"
duration=218 termination_type=exited termination_id=1 corefile=no
cutime=43 cstime=82
<<>>
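
The "Size error" line in the log above reports three values: the size the test expected, the size returned by stat, and the offset returned by seeking to the end of the file. As a rough sketch of that kind of check (demo_check_size() is a hypothetical helper written for illustration, not the fsx-linux source), it could look like this in C:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: compare the expected file size with what
 * fstat() and lseek(SEEK_END) report for the same descriptor.
 * Returns 0 when all three agree, 1 on a mismatch, -1 on a syscall error.
 */
static int demo_check_size(int fd, off_t expected)
{
    struct stat st;
    off_t end;

    if (fstat(fd, &st) != 0)
        return -1;

    end = lseek(fd, 0, SEEK_END);
    if (end == (off_t)-1)
        return -1;

    if (st.st_size != expected || end != expected) {
        fprintf(stderr, "Size error: expected 0x%llx stat 0x%llx seek 0x%llx\n",
                (unsigned long long)expected,
                (unsigned long long)st.st_size,
                (unsigned long long)end);
        return 1;
    }
    return 0;
}
```

In the failing run the stat value (0x3bbca) is the size before the last TRUNCATE DOWN, while the seek value matches the expected 0x36ef9, so a check of this shape flags the stale attribute.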


-


Kernel Mode Log: (using printk calls that I added)
-
Time:  My Mark:   Task ptr:  comments (include function name):
[  280.883701] gchen_tag: f5c3, nfs_read_done call nfs_refresh_inode, cur=0x3bbca, new=0x3bbca
[  280.890677] gchen_tag: f5c3, nfs_read_done call nfs_refresh_inode, cur=0x3bbca, new=0x3bbca
[  280.897437] gchen_tag: f5c3, nfs_read_done call nfs_refresh_inode, cur=0x3bbca, new=0x3bbca
[  280.897441] gchen_tag: f5e48c90, nfs_setattr_update_inode, cur=3bbca, new=36ef9
[  280.897450] gchen_tag: f5e48c90, nfs_setattr
[  280.897462] gchen_tag: hit, f5c3, nfs_refresh_inode_locked, cur=36ef9, new=3bbca
[  280.897469] gchen_tag: f5c3, nfs_update_inode, change size, cur=36ef9, new=3bbca
[  280.898129] gchen_tag: f5e48c90, nfs_update_inode, change size, cur=3bbca, new=36ef9
[  280.977915] gchen_tag: f5c3, nfs_update_inode, not change size,
cur=36ef9, n

linux-next: manual merge of the akpm tree with Linus' tree

2012-10-14 Thread Stephen Rothwell

Hi Andrew,

Today's linux-next merge of the akpm tree got a conflict in
arch/arm64/include/asm/unistd32.h between commit f3d447a97f24 ("arm64: Do
not include asm/unistd32.h in asm/unistd.h") from Linus' tree and commit
"compat: generic compat_sys_sched_rr_get_interval implementation" from
the akpm tree.

I fixed it up (I think - see below) and can carry the fix as necessary
(no action is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 63f853f..0f13ca8 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -25,5 +25,6 @@
 #define __ARCH_WANT_SYS_SIGPROCMASK
 #define __ARCH_WANT_COMPAT_SYS_RT_SIGSUSPEND
 #define __ARCH_WANT_COMPAT_SYS_SENDFILE
+#define __ARCH_WANT_COMPAT_SYS_SCHED_RR_GET_INTERVAL
 #endif
 #include 



[PATCH] docbook: networking: fix file paths for uapi headers

2012-10-14 Thread Randy Dunlap

From: Randy Dunlap 

Update file paths in Documentation/DocBook/networking.tmpl for uapi headers.

Signed-off-by: Randy Dunlap 
---
 Documentation/DocBook/networking.tmpl |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- lnx-37-rc1.orig/Documentation/DocBook/networking.tmpl
+++ lnx-37-rc1/Documentation/DocBook/networking.tmpl
@@ -56,7 +56,7 @@
 !Enet/core/filter.c
  
  Generic Network Statistics
-!Iinclude/linux/gen_stats.h
+!Iinclude/uapi/linux/gen_stats.h
 !Enet/core/gen_stats.c
 !Enet/core/gen_estimator.c
  
@@ -80,7 +80,7 @@
 !Enet/wimax/op-rfkill.c
 !Enet/wimax/stack.c
 !Iinclude/net/wimax.h
-!Iinclude/linux/wimax.h
+!Iinclude/uapi/linux/wimax.h
  
   
 

Re: Linux 3.7-rc1 (uml uapi errors)

2012-10-14 Thread Randy Dunlap

On 10/14/2012 06:15 PM, Randy Dunlap wrote:

> On 10/14/2012 03:27 PM, Linus Torvalds wrote:
> 
>> The two weeks are up, and I was merging during my trip, so no reason
>> for merge window extensions.
>>
>> The 3.7-rc1 kernel is out there. There's a few big things worth noting here:
>>
>>  - the "uapi" include file cleanups. The idea is that the stuff
>> exported to user space should now be found under include/uapi and
>> arch/$(ARCH)/include/uapi.
>>
>>Let's hope it actually works. Because otherwise this was just a
>> totally pointless pain in the *ss. And regardless, I'm definitely done
>> with these kinds of "let's do massive cleanup of the include files"
>> forever.
> 
> 
> 
> Building um (uml) for x86_64 (defconfig) has lots of errors like:
> 
> In file included from include/linux/irq.h:22:0,
>  from include/asm-generic/hardirq.h:12,
>  from arch/um/include/generated/asm/hardirq.h:1,
>  from include/linux/hardirq.h:7,
>  from include/linux/ftrace_event.h:7,
>  from include/trace/syscall.h:6,
>  from include/linux/syscalls.h:78,
>  from init/noinitramfs.c:23:
> include/linux/irqnr.h:4:30: fatal error: uapi/linux/irqnr.h: No such file or directory
> 
> make[2]: *** [init/main.o] Error 1
> make[2]: *** [init/noinitramfs.o] Error 1
> make[2]: *** [init/do_mounts.o] Error 1
> make[2]: *** [arch/um/kernel/irq.o] Error 1



Similar build errors on i386 (X86_32) and x86_64.
Maybe they are due to using O=subdir when building?

-- 
~Randy

Re: [RFC][CFT][CFReview] execve and kernel_thread unification work

2012-10-14 Thread Al Viro

On Mon, Oct 01, 2012 at 10:38:09PM +0100, Al Viro wrote:
> [with apologies for folks Cc'd, resent due to mis-autoexpanded l-k address
> on the original posting ;-/  Mea culpa...]
> 
>   There's an interesting ongoing project around kernel_thread() and
> friends, including execve() variants.  I really need help from architecture
> maintainers on that one; I'd been able to handle (and test) quite a few
> architectures on my own [alpha, arm, m68k, powerpc, s390, sparc, x86, um]
> plus two more untested [frv, mn10300].  c6x patches had been supplied by
> Mark Salter; everything else remains to be done.  Right now it's at
> minus 1.2KLoC, quite a bit of that removed from asm glue and other black 
> magic.

Update:
* all infrastructure is in mainline now, along with conversion for
kernel_thread() callbacks to the form that allows really simple model for
kernel_execve() _without_ flagday changes.
* #experimental-kernel_thread is gone; this stuff is in for-next
now.
* a lot of architecture conversions had been done and some are
even tested.  Currently missing are only 7 - avr32, hexagon, m32r, openrisc,
score, tile and xtensa.  OTOH, a lot are completely untested.  I've put
per-architecture stuff into separate branches and I promise never rebase
those once arch maintainers will be OK with the stuff in them.  IOW, they'll
be safe to pull into respective architecture trees.

Folks, *please* review the stuff in signal.git#arch-*.  All of them are
completely independent.  I'll be glad to get ACKs/fixes/replacements/etc.
I've merged some of those into for-next, but that can change at any time -
it's not final; for-next will be rebased.  Obviously, I hope to get to
the situation when all of those branches (plus currently missing ones)
get into shape that satisfies architecture maintainers.  Once that happens,
all those branches will be merged into for-next.

I think the model is about final wrt kernel_thread()/kernel_execve()/
sys_execve().  There's one possible change on top of it, but it's reasonably
well-isolated from the rest.  As it is, the model to aim for is this:
* select GENERIC_KERNEL_THREAD and GENERIC_KERNEL_EXECVE
* kill local kernel_thread()/kernel_execve() implementations
* generic kernel_thread() will call your copy_thread() with
NULL regs and fn/arg passed in the pair of arguments that are blindly
passed all the way through to copy_thread() - usp and stack_size resp.
In such case copy_thread() should arrange for the newborn to be woken
up in a function that is very similar to ret_from_fork().  The only
difference is that between the call of schedule_tail() and jumping into
the "return from syscall" code it should call fn(arg), using the data
left for it by copy_thread().
* unlike the previous variant, ret_from_kernel_execve() is not
needed at all; no need to play longjmp()-like games when kernel_thread()
callbacks had been taught to return normally all the way out when
kernel_execve() returns 0; any updates of sp/manipulations of register
windows/etc. will happen without any magic.
* provide current_pt_regs() if needed.  Default is
task_pt_regs(current), but you might want to optimize it and unlike
task_pt_regs() it must work whenever we are in syscall or in a kernel thread.
task_pt_regs(task), OTOH, is required to work only when task can be
interrogated by tracer.
* no more syscalls-from-kernel, which often allows for simplifications
in the syscall entry/exit logics.  I haven't done any of those; up to the
architecture maintainers.
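
The model in the list above can be illustrated with a small userspace sketch. Everything here is hypothetical (the model_* names, the struct layouts, the single sp register); it only mirrors the described convention that a NULL regs pointer marks a kernel thread and that fn/arg travel in the usp and stack_size arguments. It also assumes a platform where a function pointer fits in unsigned long, as on Linux.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not the kernel's types. */
struct model_regs {
    unsigned long sp;
};

struct model_task {
    int (*fn)(void *);          /* non-NULL only for kernel threads */
    void *arg;
    struct model_regs regs;     /* user register state for fork()/clone() */
};

/* Models copy_thread(): NULL regs means "kernel thread". */
static void model_copy_thread(struct model_task *child, unsigned long usp,
                              unsigned long stack_size,
                              const struct model_regs *regs)
{
    if (regs == NULL) {
        /* Kernel thread: usp and stack_size carry fn and arg. */
        child->fn = (int (*)(void *))(uintptr_t)usp;
        child->arg = (void *)(uintptr_t)stack_size;
        child->regs.sp = 0;
        return;
    }
    /* fork()/clone(): copy the parent's user registers, use the new stack. */
    child->fn = NULL;
    child->arg = NULL;
    child->regs = *regs;
    child->regs.sp = usp;
}

/* Models the first wakeup of the newborn task. */
static int model_first_wakeup(struct model_task *child)
{
    /* schedule_tail() would be called here. */
    if (child->fn != NULL)
        return child->fn(child->arg);   /* kernel thread: run fn(arg) */
    /* The "return from syscall" path would follow here. */
    return 0;
}

/* Sample callback used to exercise the model. */
static int model_set_flag(void *arg)
{
    int *flag = arg;

    *flag = 1;
    return 0;
}
```

Passing model_set_flag and a pointer to an int through usp and stack_size with NULL regs, then calling model_first_wakeup(), runs the callback once; passing a non-NULL regs leaves fn unset, so the wakeup falls straight through to the return path.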

One thing to keep in mind is that right now on SMP architectures
there's the third caller of copy_thread(), besides fork()/clone()/vfork()
(all pass userland pt_regs, with the address being current_pt_regs()) and
kernel_thread() (pass NULL pt_regs, kthread creation time).  It's fork_idle()
and it passes zero-filled pt_regs.  Frankly, I'm not even sure we want to
call copy_thread() in that case - the stuff set up by it goes nowhere.
We do that for each possible secondary CPU on SMP and we do *not* expose
those threads to scheduler.  When CPU gets initialized we have the
secondary bootstrap take that task_struct as current.  Its kernel stack,
thread_info, etc. are set up by said secondary bootstrap, overriding whatever
copy_thread() has done.  Eventually the bootstrap reaches cpu_idle(),
which is where we schedule away.  switch_to() done by schedule() is what
completes setting the things up; at that point they are ready to be woken
up - and not in ret_from_fork(), of course.
For the majority of architectures nothing done by copy_thread() in
that case is used afterwards, so we might as well stop calling it when
copy_process() is called by fork_idle().  I know of only one dubious case -
powerpc sets thread->ksp_limit on copy_thread() and I'm not sure if
that's get overwritten in secondary bootstrap - the value would be still
correct and I don't see any obvious places where it would be re

Re: [PATCH REGRESSION FIX] dw_dmac: make driver's endianness configurable

2012-10-14 Thread Hein Tibosch

Hi Andy,
On 10/15/2012 4:08 AM, Andy Shevchenko wrote:
> On Sun, Oct 14, 2012 at 10:54 AM, Hein Tibosch  wrote:
>> From: Hein Tibosch 
>>
>> The dw_dmac was originally developed for avr32 to be used with the Synopsys
>> DesignWare AHB DMA controller. Starting from 2.6.38, access to the device's 
>> i/o
>> 
>>  #define dma_readl(dw, name) \
>> -   readl(&(__dw_regs(dw)->name))
>> +   dma_readl_native(&(__dw_regs(dw)->name))
>>  #define dma_writel(dw, name, val) \
>> -   writel((val), &(__dw_regs(dw)->name))
>> +   dma_writel_native((val), &(__dw_regs(dw)->name))
>>
>>  #define channel_set_bit(dw, reg, mask) \
>> dma_writel(dw, reg, ((mask) << 8) | (mask))
> Why did you not change this one?

Because "dma_writel" already calls "dma_writel_native"


[PATCH -mmotm] kernel: fix synchro-test.c printk format warnings

2012-10-14 Thread Randy Dunlap

From: Randy Dunlap 

Fix printk format warnings in kernel/synchro-test.c.
Fixes warnings on i386 and on x86_64.

kernel/synchro-test.c:432:4: warning: format '%u' expects type 'unsigned int', but argument 5 has type 'long unsigned int'
kernel/synchro-test.c:437:4: warning: format '%u' expects type 'unsigned int', but argument 5 has type 'long unsigned int'
kernel/synchro-test.c:442:4: warning: format '%u' expects type 'unsigned int', but argument 5 has type 'long unsigned int'
kernel/synchro-test.c:447:4: warning: format '%u' expects type 'unsigned int', but argument 5 has type 'long unsigned int'
kernel/synchro-test.c:452:4: warning: format '%u' expects type 'unsigned int', but argument 5 has type 'long unsigned int'

Signed-off-by: Randy Dunlap 
---
 kernel/synchro-test.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

--- mmotm-2012-1011-1614.orig/kernel/synchro-test.c
+++ mmotm-2012-1011-1614/kernel/synchro-test.c
@@ -429,27 +429,27 @@ static int __init do_tests(void)
for (loop = 0; loop < MAX_THREADS; loop++) {
if (loop < nummx) {
init_completion(&mx_comp[loop]);
-   kthread_run(mutexer, (void *) loop, "Mutex%u", loop);
+   kthread_run(mutexer, (void *) loop, "Mutex%lu", loop);
}
 
if (loop < numsm) {
init_completion(&sm_comp[loop]);
-   kthread_run(semaphorer, (void *) loop, "Sem%u", loop);
+   kthread_run(semaphorer, (void *) loop, "Sem%lu", loop);
}
 
if (loop < numrd) {
init_completion(&rd_comp[loop]);
-   kthread_run(reader, (void *) loop, "Read%u", loop);
+   kthread_run(reader, (void *) loop, "Read%lu", loop);
}
 
if (loop < numwr) {
init_completion(&wr_comp[loop]);
-   kthread_run(writer, (void *) loop, "Write%u", loop);
+   kthread_run(writer, (void *) loop, "Write%lu", loop);
}
 
if (loop < numdg) {
init_completion(&dg_comp[loop]);
-   kthread_run(downgrader, (void *) loop, "Down%u", loop);
+   kthread_run(downgrader, (void *) loop, "Down%lu", loop);
}
}
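
The warnings fixed above come down to matching the conversion specifier to the argument type: the loop counter is an unsigned long, so it needs %lu rather than %u. A minimal userspace sketch of the same formatting, using snprintf() in place of kthread_run() (demo_thread_name() is a hypothetical helper, not part of synchro-test.c):

```c
#include <stdio.h>

/*
 * Hypothetical helper: build a thread name such as "Mutex3".
 * The index is unsigned long, so the format uses %lu; with %u the
 * compiler reports the mismatch shown in the warnings.
 */
static int demo_thread_name(char *buf, size_t len, const char *prefix,
                            unsigned long index)
{
    return snprintf(buf, len, "%s%lu", prefix, index);
}
```

On x86_64 unsigned int is 32 bits and unsigned long is 64 bits, so the mismatch is more than cosmetic there.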
 

Re: Linux 3.7-rc1 (uml uapi errors)

2012-10-14 Thread Randy Dunlap

On 10/14/2012 03:27 PM, Linus Torvalds wrote:

> The two weeks are up, and I was merging during my trip, so no reason
> for merge window extensions.
> 
> The 3.7-rc1 kernel is out there. There's a few big things worth noting here:
> 
>  - the "uapi" include file cleanups. The idea is that the stuff
> exported to user space should now be found under include/uapi and
> arch/$(ARCH)/include/uapi.
> 
>Let's hope it actually works. Because otherwise this was just a
> totally pointless pain in the *ss. And regardless, I'm definitely done
> with these kinds of "let's do massive cleanup of the include files"
> forever.



Building um (uml) for x86_64 (defconfig) has lots of errors like:

In file included from include/linux/irq.h:22:0,
 from include/asm-generic/hardirq.h:12,
 from arch/um/include/generated/asm/hardirq.h:1,
 from include/linux/hardirq.h:7,
 from include/linux/ftrace_event.h:7,
 from include/trace/syscall.h:6,
 from include/linux/syscalls.h:78,
 from init/noinitramfs.c:23:
include/linux/irqnr.h:4:30: fatal error: uapi/linux/irqnr.h: No such file or directory

make[2]: *** [init/main.o] Error 1
make[2]: *** [init/noinitramfs.o] Error 1
make[2]: *** [init/do_mounts.o] Error 1
make[2]: *** [arch/um/kernel/irq.o] Error 1






-- 
~Randy

Re: [ 097/120] rcu: Fix day-one dyntick-idle stall-warning bug

2012-10-14 Thread Paul E. McKenney

On Mon, Oct 15, 2012 at 12:54:13AM +0100, Ben Hutchings wrote:
> On Sun, 2012-10-14 at 16:32 -0700, Paul E. McKenney wrote:
> > On Fri, Oct 12, 2012 at 11:14:28PM +0100, Ben Hutchings wrote:
> > > On Thu, 2012-10-11 at 10:00 +0900, Greg Kroah-Hartman wrote:
> > > > 3.4-stable review patch.  If anyone has any objections, please let me 
> > > > know.
> > > > 
> > > > --
> > > > 
> > > > From: "Paul E. McKenney" 
> > > > 
> > > > commit a10d206ef1a83121ab7430cb196e0376a7145b22 upstream.
> > > [...]
> > > > This commit therefore makes CPUs check more carefully before starting a
> > > > new grace period.  This new check relies on an array of tail pointers
> > > > into each CPU's list of callbacks.  If the CPU is up to date on which
> > > > grace periods have completed, it checks to see if any callbacks follow
> > > > the RCU_DONE_TAIL segment, otherwise it checks to see if any callbacks
> > > > follow the RCU_WAIT_TAIL segment.  The reason that this works is that
> > > > the RCU_WAIT_TAIL segment will be promoted to the RCU_DONE_TAIL segment
> > > > as soon as the CPU is officially notified that the old grace period
> > > > has ended.
> > > [...]
> > > > --- a/kernel/rcutree.c
> > > > +++ b/kernel/rcutree.c
> > > > @@ -295,7 +295,9 @@ cpu_has_callbacks_ready_to_invoke(struct
> > > >  static int
> > > >  cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
> > > >  {
> > > > -   return *rdp->nxttail[RCU_DONE_TAIL] && !rcu_gp_in_progress(rsp);
> > > > +   return *rdp->nxttail[RCU_DONE_TAIL +
> > > > +ACCESS_ONCE(rsp->completed) != 
> > > > rdp->completed] &&
> > > 
> > > This is a very obscurely written expression.  The array index is parsed
> > > as:
> > >   (RCU_DONE_TAIL + ACCESS_ONCE(rsp->completed)) != rdp->completed
> > > 
> > > Since RCU_DONE_TAIL == 0 and RCU_WAIT_TAIL == 1, this is then equivalent
> > > to:
> > >   ACCESS_ONCE(rsp->completed) != rdp->completed
> > > or:
> > >   (ACCESS_ONCE(rsp->completed) != rdp->completed) ? RCU_WAIT_TAIL : 
> > > RCU_DONE_TAIL
> > > 
> > > But whyever didn't you write that explicitly?
> > 
> > Because the way I think of it is the way that I wrote it -- you should
> > look at the value of the first pointer unless this CPU isn't up to date
> > with the latest grace period, in which case you need to go one step
> > farther up the array of tail pointers.
> 
> That is not the way you wrote it, since + has higher precedence than !=.

Color me slow and stupid!!!  Indeed, it is working by accident.  I clearly
need to either add the parentheses or use one of the other forms...

Thanx, Paul
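
The precedence point in this exchange can be checked in isolation. The sketch below is a standalone illustration, not the rcutree.c code: the function names and the base parameter are made up for the example, and base stands in for RCU_DONE_TAIL.

```c
#include <assert.h>

/*
 * Index as written in the patch: '+' binds tighter than '!=', so this
 * is (base + rsp_completed) != rdp_completed, which yields 0 or 1.
 */
static int index_as_written(int base, unsigned long rsp_completed,
                            unsigned long rdp_completed)
{
    return base + rsp_completed != rdp_completed;
}

/*
 * Index as described in the commit message: start at base and step one
 * segment further when the two completed counters differ.
 */
static int index_as_intended(int base, unsigned long rsp_completed,
                             unsigned long rdp_completed)
{
    return base + (rsp_completed != rdp_completed);
}
```

With base equal to 0 (the value of RCU_DONE_TAIL given in the thread) both functions return the same result for any pair of counters, which is why the code works by accident. With a nonzero base they diverge: for base 1 and counters 5 and 6, the first returns 0 and the second returns 2.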


Re: [PATCH] regmap : make lock/unlock functions customizable.

2012-10-14 Thread Mark Brown

On Fri, Oct 12, 2012 at 10:24:12AM +0200, Davide Ciminaghi wrote:
> On Fri, Oct 12, 2012 at 03:26:09PM +0900, Mark Brown wrote:
> > On Mon, Oct 01, 2012 at 11:31:04PM +0200, cimina...@gnudd.com wrote:

> > > + struct regmap *map = (struct regmap *)__map;
> > >   mutex_lock(&map->mutex);

> > ...you should never need to cast away from or to void, if you do there's
> > a bug somewhere.

> regmap lock/unlock original functions just received a struct regmap * .
> I needed a void * for the customized version of such functions, so just
> replaced struct regmap * with void *

You're not getting the point.  The issue is the casts, not the prototype
of the function.  If those casts when referencing the pointer do
anything at all it's masking a bug, you should never need to cast a
pointer to or from void.
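
A short sketch of the point about casts, using made-up names (struct demo_map and the demo_* functions are hypothetical, not the regmap API): in C a void pointer converts to and from any object pointer type implicitly, so callbacks taking a void * argument need no cast on either side.

```c
#include <stddef.h>

/* Hypothetical stand-in for a structure protected by a lock. */
struct demo_map {
    int lock_count;
};

/* Callbacks take void * so callers can supply their own lock argument. */
typedef void (*demo_lock_fn)(void *lock_arg);

static void demo_lock(void *lock_arg)
{
    struct demo_map *map = lock_arg;    /* implicit conversion, no cast */

    map->lock_count++;
}

static void demo_unlock(void *lock_arg)
{
    struct demo_map *map = lock_arg;    /* implicit conversion, no cast */

    map->lock_count--;
}

/* Runs one lock/unlock pair through the function pointers. */
static int demo_locked_section(demo_lock_fn lock, demo_lock_fn unlock,
                               struct demo_map *map)
{
    int held;

    lock(map);                          /* struct pointer to void *: no cast */
    held = map->lock_count;
    unlock(map);
    return held;
}
```

If a cast were needed to make such code compile, that would suggest the pointer is not really the type the code assumes.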

Re: [PATCH V2 4/5] regmap: add API to get irq_domain from regmap irq

2012-10-14 Thread Mark Brown

On Wed, Oct 10, 2012 at 10:49:04PM +0530, Laxman Dewangan wrote:
> Add API regmap_irq_get_irq_domain() for getting the
> irq domain from regmap irq. The irq domain created on
> result of regmap_add_irq_chip() from driver.

Sorry, when doing the merge down after the merge window I realised that
I already have a patch there adding a function regmap_irq_get_domain()
which has been sitting there for a while now, I think it was just fat
fingers that meant it didn't make v3.7.

[PATCH] power: remove duplicated include of abx500.h from ab8500_fg.c

2012-10-14 Thread Thiago Farina

Signed-off-by: Thiago Farina 
---
 drivers/power/ab8500_fg.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/power/ab8500_fg.c b/drivers/power/ab8500_fg.c
index 2db8cc2..1041397 100644
--- a/drivers/power/ab8500_fg.c
+++ b/drivers/power/ab8500_fg.c
@@ -15,22 +15,21 @@
  * Arun R Murthy 
  */
 
-#include 
-#include 
+#include 
+#include 
 #include 
+#include 
 #include 
-#include 
-#include 
 #include 
-#include 
 #include 
-#include 
 #include 
-#include 
 #include 
-#include 
+#include 
+#include 
+#include 
+#include 
+#include 
 #include 
-#include 
 
 #define MILLI_TO_MICRO 1000
 #define FG_LSB_IN_MA   1627
-- 
1.8.0.rc2


linux-next: build failure after merge of the kvm-ppc tree

2012-10-14 Thread Stephen Rothwell

Hi Alexander,

Take into account that I removed the top two WIP commits from the kvm-ppc
tree before the following ...

After merging the kvm-ppc tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

ERROR: ".kvm_irqfd" [arch/powerpc/kvm/kvm.ko] undefined!
ERROR: ".kvm_irqfd_release" [arch/powerpc/kvm/kvm.ko] undefined!
ERROR: ".kvm_eventfd_init" [arch/powerpc/kvm/kvm.ko] undefined!
ERROR: ".kvm_ioeventfd" [arch/powerpc/kvm/kvm.ko] undefined!

I have used the kvm-ppc tree from next-20121012 for today.
-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au



linux-next: manual merge of the tip tree with Linus' tree

2012-10-14 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the tip tree got a conflict in
include/linux/mempolicy.h between commit 607ca46e97a1 ("UAPI: (Scripted)
Disintegrate include/linux") from Linus' tree and commits 6f98f92971e9
("mm/mpol: Make MPOL_LOCAL a real policy"), 84e3a981648d ("mm/mpol: Add
MPOL_MF_LAZY ..."), 0719b9688bfe ("mm/mpol: Add MPOL_MF_NOOP"),
4d58c795f691 ("mm/mpol: Check for misplaced page") and fa74ef9e42df
("sched/numa: Implement per task memory placement for 'big' processes")
from the tip tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

I also added this merge fix patch:

From: Stephen Rothwell 
Date: Mon, 15 Oct 2012 11:14:21 +1100
Subject: [PATCH] mm/pol: fixups for UAPI include files split

Signed-off-by: Stephen Rothwell 
---
 include/uapi/linux/mempolicy.h |   18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 23e62e0..0c774c6 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -20,6 +20,8 @@ enum {
MPOL_PREFERRED,
MPOL_BIND,
MPOL_INTERLEAVE,
+   MPOL_LOCAL,
+   MPOL_NOOP,  /* retain existing policy for range */
MPOL_MAX,   /* always last member of enum */
 };
 
@@ -47,9 +49,16 @@ enum mpol_rebind_step {
 
 /* Flags for mbind */
 #define MPOL_MF_STRICT (1<<0)  /* Verify existing pages in the mapping */
-#define MPOL_MF_MOVE   (1<<1)  /* Move pages owned by this process to conform to mapping */
-#define MPOL_MF_MOVE_ALL (1<<2)/* Move every page to conform to mapping */
-#define MPOL_MF_INTERNAL (1<<3)/* Internal flags start here */
+#define MPOL_MF_MOVE (1<<1) /* Move pages owned by this process to conform
+  to policy */
+#define MPOL_MF_MOVE_ALL (1<<2)/* Move every page to conform to policy */
+#define MPOL_MF_LAZY (1<<3) /* Modifies '_MOVE:  lazy migrate on fault */
+#define MPOL_MF_INTERNAL (1<<4)/* Internal flags start here */
+
+#define MPOL_MF_VALID  (MPOL_MF_STRICT   | \
+MPOL_MF_MOVE | \
+MPOL_MF_MOVE_ALL | \
+MPOL_MF_LAZY)
 
 /*
  * Internal flags that share the struct mempolicy flags word with
@@ -59,6 +68,7 @@ enum mpol_rebind_step {
 #define MPOL_F_SHARED  (1 << 0)/* identify shared policies */
 #define MPOL_F_LOCAL   (1 << 1)/* preferred local allocation */
 #define MPOL_F_REBINDING (1 << 2)  /* identify policies in rebinding */
-
+#define MPOL_F_MOF (1 << 3) /* this policy wants migrate on fault */
+#define MPOL_F_HOME (1 << 4) /* this is the home-node policy */
 
 #endif /* _UAPI_LINUX_MEMPOLICY_H */
-- 
1.7.10.280.gaa39
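
As an illustration of how a mask such as MPOL_MF_VALID is typically used, the sketch below rejects any flag bits outside the mask. The flag values are copied from the patch above; demo_check_mbind_flags() is a hypothetical helper and is not taken from mm/mempolicy.c.

```c
#include <errno.h>

/* Flag values as they appear in the merge fix patch. */
#define MPOL_MF_STRICT   (1<<0)
#define MPOL_MF_MOVE     (1<<1)
#define MPOL_MF_MOVE_ALL (1<<2)
#define MPOL_MF_LAZY     (1<<3)
#define MPOL_MF_INTERNAL (1<<4)

#define MPOL_MF_VALID    (MPOL_MF_STRICT   | \
                          MPOL_MF_MOVE     | \
                          MPOL_MF_MOVE_ALL | \
                          MPOL_MF_LAZY)

/*
 * Hypothetical helper: accept only flags covered by MPOL_MF_VALID.
 * Returns 0 when the flags are acceptable, -EINVAL otherwise.
 */
static int demo_check_mbind_flags(unsigned long flags)
{
    if (flags & ~(unsigned long)MPOL_MF_VALID)
        return -EINVAL;
    return 0;
}
```

Because MPOL_MF_INTERNAL moved from bit 3 to bit 4 to make room for MPOL_MF_LAZY, the mask covers bits 0 through 3 and a caller passing an internal flag is rejected.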

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc include/linux/mempolicy.h
index e5ccb9d,67c9734..000
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@@ -2,10 -7,72 +2,9 @@@
   * NUMA memory policies for Linux.
   * Copyright 2003,2004 Andi Kleen SuSE Labs
   */
 -
 -/*
 - * Both the MPOL_* mempolicy mode and the MPOL_F_* optional mode flags are
 - * passed by the user to either set_mempolicy() or mbind() in an 'int' actual.
 - * The MPOL_MODE_FLAGS macro determines the legal set of optional mode flags.
 - */
 -
 -/* Policies */
 -enum {
 -  MPOL_DEFAULT,
 -  MPOL_PREFERRED,
 -  MPOL_BIND,
 -  MPOL_INTERLEAVE,
 -  MPOL_LOCAL,
 -  MPOL_NOOP,  /* retain existing policy for range */
 -  MPOL_MAX,   /* always last member of enum */
 -};
 -
 -enum mpol_rebind_step {
 -  MPOL_REBIND_ONCE,   /* do rebind work at once(not by two step) */
 -  MPOL_REBIND_STEP1,  /* first step(set all the newly nodes) */
 -  MPOL_REBIND_STEP2,  /* second step(clean all the disallowed nodes)*/
 -  MPOL_REBIND_NSTEP,
 -};
 -
 -/* Flags for set_mempolicy */
 -#define MPOL_F_STATIC_NODES   (1 << 15)
 -#define MPOL_F_RELATIVE_NODES (1 << 14)
 -
 -/*
 - * MPOL_MODE_FLAGS is the union of all possible optional mode flags passed to
 - * either set_mempolicy() or mbind().
 - */
 -#define MPOL_MODE_FLAGS   (MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES)
 -
 -/* Flags for get_mempolicy */
 -#define MPOL_F_NODE   (1<<0)  /* return next IL mode instead of node mask */
 -#define MPOL_F_ADDR   (1<<1)  /* look up vma using address */
 -#define MPOL_F_MEMS_ALLOWED (1<<2) /* return allowed memories */
 -
 -/* Flags for mbind */
 -#define MPOL_MF_STRICT (1<<0)  /* Verify existing pages in the mapping */
 -#define MPOL_MF_MOVE   (1<<1) /* Move pages owned by this process to conform
 - to policy */
 -#define MPOL_MF_MOVE_ALL (1<<2)   /* Move every page to conform to policy */
 -#define MPOL_MF_LAZY   (1<<3) /* Modifies '_MOVE:  lazy migrate on fault */
 -#define MPOL_MF_INTERNAL (1<<4)   /* Internal flags start here */

Re: [RFC v3 00/13] vfs: hot data tracking

2012-10-14 Thread Zheng Liu

On Wed, Oct 10, 2012 at 06:07:22PM +0800, zwu.ker...@gmail.com wrote:
> From: Zhi Yong Wu 
> 
> NOTE:
> 
>   The patchset is currently post out mainly to make sure
> it is going in the correct direction and hope to get some
> helpful comments from other guys.
>   For more information, please check hot_tracking.txt in Documentation

Hi Zhi Yong,

If I want to use this patch set with ext4, can I apply it directly, or do
I need to call some functions as btrfs does?  Thanks.

Regards,
Zheng
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC v3 13/13] vfs: add documentation

2012-10-14 Thread Zheng Liu

Hi Zhi Yong,

[cut...]
> +3. The Design
> +
> +These include the following parts:
> +
> +* Hooks in existing vfs functions to track data access frequency
> +
> +* New rbtrees for tracking access frequency of inodes and sub-file
 ^^^ s/rbtrees/radix-trees
> +ranges (hot_rb.c)
    It now seems that all of the code is in a single file.

Regards,
Zheng

Re: [ 097/120] rcu: Fix day-one dyntick-idle stall-warning bug

2012-10-14 Thread Ben Hutchings

On Sun, 2012-10-14 at 16:32 -0700, Paul E. McKenney wrote:
> On Fri, Oct 12, 2012 at 11:14:28PM +0100, Ben Hutchings wrote:
> > On Thu, 2012-10-11 at 10:00 +0900, Greg Kroah-Hartman wrote:
> > > 3.4-stable review patch.  If anyone has any objections, please let me 
> > > know.
> > > 
> > > --
> > > 
> > > From: "Paul E. McKenney" 
> > > 
> > > commit a10d206ef1a83121ab7430cb196e0376a7145b22 upstream.
> > [...]
> > > This commit therefore makes CPUs check more carefully before starting a
> > > new grace period.  This new check relies on an array of tail pointers
> > > into each CPU's list of callbacks.  If the CPU is up to date on which
> > > grace periods have completed, it checks to see if any callbacks follow
> > > the RCU_DONE_TAIL segment, otherwise it checks to see if any callbacks
> > > follow the RCU_WAIT_TAIL segment.  The reason that this works is that
> > > the RCU_WAIT_TAIL segment will be promoted to the RCU_DONE_TAIL segment
> > > as soon as the CPU is officially notified that the old grace period
> > > has ended.
> > [...]
> > > --- a/kernel/rcutree.c
> > > +++ b/kernel/rcutree.c
> > > @@ -295,7 +295,9 @@ cpu_has_callbacks_ready_to_invoke(struct
> > >  static int
> > >  cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
> > >  {
> > > - return *rdp->nxttail[RCU_DONE_TAIL] && !rcu_gp_in_progress(rsp);
> > > + return *rdp->nxttail[RCU_DONE_TAIL +
> > > +  ACCESS_ONCE(rsp->completed) != rdp->completed] &&
> > 
> > This is a very obscurely written expression.  The array index is parsed
> > as:
> > (RCU_DONE_TAIL + ACCESS_ONCE(rsp->completed)) != rdp->completed
> > 
> > Since RCU_DONE_TAIL == 0 and RCU_WAIT_TAIL == 1, this is then equivalent
> > to:
> > ACCESS_ONCE(rsp->completed) != rdp->completed
> > or:
> > (ACCESS_ONCE(rsp->completed) != rdp->completed) ? RCU_WAIT_TAIL : 
> > RCU_DONE_TAIL
> > 
> > But whyever didn't you write that explicitly?
> 
> Because the way I think of it is the way that I wrote it -- you should
> look at the value of the first pointer unless this CPU isn't up to date
> with the latest grace period, in which case you need to go one step
> farther up the array of tail pointers.

That is not the way you wrote it, since + has higher precedence than !=.

Ben.

-- 
Ben Hutchings
The world is coming to an end.  Please log off.



Re: [ 097/120] rcu: Fix day-one dyntick-idle stall-warning bug

2012-10-14 Thread Paul E. McKenney

On Fri, Oct 12, 2012 at 11:14:28PM +0100, Ben Hutchings wrote:
> On Thu, 2012-10-11 at 10:00 +0900, Greg Kroah-Hartman wrote:
> > 3.4-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: "Paul E. McKenney" 
> > 
> > commit a10d206ef1a83121ab7430cb196e0376a7145b22 upstream.
> [...]
> > This commit therefore makes CPUs check more carefully before starting a
> > new grace period.  This new check relies on an array of tail pointers
> > into each CPU's list of callbacks.  If the CPU is up to date on which
> > grace periods have completed, it checks to see if any callbacks follow
> > the RCU_DONE_TAIL segment, otherwise it checks to see if any callbacks
> > follow the RCU_WAIT_TAIL segment.  The reason that this works is that
> > the RCU_WAIT_TAIL segment will be promoted to the RCU_DONE_TAIL segment
> > as soon as the CPU is officially notified that the old grace period
> > has ended.
> [...]
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -295,7 +295,9 @@ cpu_has_callbacks_ready_to_invoke(struct
> >  static int
> >  cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
> >  {
> > -   return *rdp->nxttail[RCU_DONE_TAIL] && !rcu_gp_in_progress(rsp);
> > +   return *rdp->nxttail[RCU_DONE_TAIL +
> > +ACCESS_ONCE(rsp->completed) != rdp->completed] &&
> 
> This is a very obscurely written expression.  The array index is parsed
> as:
>   (RCU_DONE_TAIL + ACCESS_ONCE(rsp->completed)) != rdp->completed
> 
> Since RCU_DONE_TAIL == 0 and RCU_WAIT_TAIL == 1, this is then equivalent
> to:
>   ACCESS_ONCE(rsp->completed) != rdp->completed
> or:
>   (ACCESS_ONCE(rsp->completed) != rdp->completed) ? RCU_WAIT_TAIL : RCU_DONE_TAIL
> 
> But whyever didn't you write that explicitly?

Because the way I think of it is the way that I wrote it -- you should
look at the value of the first pointer unless this CPU isn't up to date
with the latest grace period, in which case you need to go one step
farther up the array of tail pointers.

Thanx, Paul

> Ben.
> 
> > +  !rcu_gp_in_progress(rsp);
> >  }
> >  
> >  /*
> 
> -- 
> Ben Hutchings
> Kids!  Bringing about Armageddon can be dangerous.  Do not attempt it in
> your own home. - Terry Pratchett and Neil Gaiman, `Good Omens'
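
The precedence point in the review can be checked in isolation. The following is a minimal sketch, not kernel code: `as_parsed()` and `as_described()` are hypothetical helpers, and the `base` parameter stands in for RCU_DONE_TAIL. With a base of 0 the two readings give the same index, which is why the patch behaves as intended; with any other base they diverge.

```c
#include <assert.h>

/* What the compiler sees: '+' binds tighter than '!=', so this is
 * (base + global_completed) != local_completed, i.e. always 0 or 1. */
static int as_parsed(unsigned long base, unsigned long global_completed,
                     unsigned long local_completed)
{
    return base + global_completed != local_completed;
}

/* The reading given in the reply: start at 'base' and step one slot
 * further when the CPU is not up to date. */
static int as_described(unsigned long base, unsigned long global_completed,
                        unsigned long local_completed)
{
    return (int)(base + (global_completed != local_completed));
}

/* With base == 0 (the value of RCU_DONE_TAIL stated in the review) the two
 * forms agree for every input tried here. */
static void check_base_zero(void)
{
    for (unsigned long g = 0; g < 4; g++)
        for (unsigned long l = 0; l < 4; l++)
            assert(as_parsed(0, g, l) == as_described(0, g, l));
}
```

The agreement depends entirely on the base being zero, which is the reviewer's argument for spelling out the conditional explicitly.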


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [git pull] signals pile 3

2012-10-14 Thread Russell King - ARM Linux

On Mon, Oct 15, 2012 at 12:39:40AM +0200, Daniel Mack wrote:
> Tested-by: Daniel Mack 
> 
> Many thanks for the very prompt response!

Thanks Daniel.

I've also tested this on my OMAP4430 board running in ARM mode, so that
still works.  Between us we've now covered both ARM mode and Thumb mode,
so...

Linus, could you merge this patch please, thanks.

8<===
From: Russell King 
Subject: [PATCH] ARM: fix oops on initial entry to userspace with Thumb2 kernels

Daniel Mack reports an oops at boot with the latest kernels:

[4.896717] Internal error: Oops - undefined instruction: 0 [#1] SMP THUMB2
[4.904034] Modules linked in:
[4.907253] CPU: 0    Not tainted  (3.6.0-11057-g584df1d #145)
[4.913372] PC is at cpsw_probe+0x45a/0x9ac
[4.917760] LR is at trace_hardirqs_on_caller+0x8f/0xfc
[4.923235] pc : []    lr : []    psr: 6113
[4.923235] sp : cf055fb0  ip :   fp : 
[4.935246] r10:   r9 :   r8 : 
[4.940715] r7 :   r6 :   r5 : c0344555  r4 : 
[4.947548] r3 : cf057a40  r2 :   r1 : 0001  r0 : 
[4.954383] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM Segment user
[4.961853] Control: 50c5387d  Table: 8f3f4019  DAC: 0015
[4.967868] Process init (pid: 1, stack limit = 0xcf054240)
[4.973702] Stack: (0xcf055fb0 to 0xcf056000)
[4.978269] 5fa0: 0001
[4.986836] 5fc0: cf055fb0 c000d1a8
[4.995403] 5fe0:  be9b3f10  b6f6add0 0010  bfaf a8babbaa

The analysis of this is as follows.  In init/main.c, we issue:

kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);

This creates a new thread, which falls through to the ret_from_fork
assembly, with r4 set NULL and r5 set to kernel_init.  You can see
this in your oops dump register set - r5 is 0xc0344555, which is the
address of kernel_init plus 1 which marks the function as Thumb code.
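
As a small illustration of the "plus 1" convention: for interworking branches such as `bx`, bit 0 of the target selects the instruction set and is not part of the fetch address. The helper names below are made up for this sketch; only the r5 value comes from the oops dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of an interworking branch target: 1 selects Thumb, 0 selects ARM. */
static bool target_is_thumb(uint32_t target)
{
    return (target & 1u) != 0;
}

/* Address that is actually fetched from: the target with bit 0 cleared. */
static uint32_t target_address(uint32_t target)
{
    return target & ~1u;
}

/* The r5 value from the register dump: kernel_init with the Thumb bit set. */
static void check_r5_from_oops(void)
{
    assert(target_is_thumb(0xc0344555u));
    assert(target_address(0xc0344555u) == 0xc0344554u);
}
```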

Now, let's look at this code a little closer - this is what the
disassembly looks like:

c000d180 :
c000d180:   f03a fe08   bl      c0047d94
c000d184:   2d00        cmp     r5, #0
c000d186:   bf1e        ittt    ne
c000d188:   4620        movne   r0, r4
c000d18a:   46fe        movne   lr, pc <-- XXX
c000d18c:   46af        movne   pc, r5
c000d18e:   46e9        mov     r9, sp
c000d190:   ea4f 3959   mov.w   r9, r9, lsr #13
c000d194:   ea4f 3949   mov.w   r9, r9, lsl #13
c000d198:   e7c8        b.n     c000d12c
c000d19a:   bf00        nop
c000d19c:   f3af 8000   nop.w

This code was introduced in 9fff2fa0db911 (arm: switch to saner
kernel_execve() semantics).  I have marked one instruction, and it's
the significant one - I'll come back to that later.

Eventually, having had a successful call to kernel_execve(), kernel_init()
returns zero.

In returning, it uses the value in 'lr' which was set by the instruction
I marked above.  Unfortunately, this causes lr to contain 0xc000d18e -
an even address.  This switches the ISA to ARM on return but with a non
word aligned PC value.
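
The arithmetic behind that lr value can be sketched as follows. This assumes the usual rule that an instruction reading pc sees its own address plus 4 in Thumb state (plus 8 in ARM state); the function names are hypothetical and the addresses are the ones from the disassembly above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value seen when an instruction at 'insn_addr' reads pc. */
static uint32_t pc_read_value(uint32_t insn_addr, bool thumb_state)
{
    return insn_addr + (thumb_state ? 4u : 8u);
}

/* State selected by an interworking return to 'lr': bit 0 set means Thumb. */
static bool returns_to_thumb(uint32_t lr)
{
    return (lr & 1u) != 0;
}

static void check_ret_from_fork(void)
{
    /* "movne lr, pc" sits at 0xc000d18a and executes in Thumb state. */
    uint32_t lr = pc_read_value(0xc000d18au, true);

    assert(lr == 0xc000d18eu);          /* the address of "mov r9, sp" */
    assert(!returns_to_thumb(lr));      /* even, so the return selects ARM */
    assert((lr & 3u) != 0);             /* and it is not word aligned */
    assert(returns_to_thumb(lr | 1u));  /* with bit 0 set it would stay Thumb */
}
```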

So, what do we end up executing?  Well, not the instructions above - yes
the opcodes, but they don't mean the same thing in ARM mode.  In ARM mode,
it looks like this instead:

c000d18c:   46e946af    strbtmi r4, [r9], pc, lsr #13
c000d190:   3959ea4f    ldmdbcc r9, {r0, r1, r2, r3, r6, r9, fp, sp, lr, pc}^
c000d194:   3949ea4f    stmdbcc r9, {r0, r1, r2, r3, r6, r9, fp, sp, lr, pc}^
c000d198:   bf00e7c8    svclt   0xe7c8
c000d19c:   8000f3af    andhi   pc, r0, pc, lsr #7
c000d1a0:   e88db092    stm     sp, {r1, r4, r7, ip, sp, pc}
c000d1a4:   46e81fff    ;  instruction: 0x46e81fff
c000d1a8:   8a00f3ef    bhi     0xc004a16c
c000d1ac:   0a0cf08a    beq     0xc03493dc

I have included more above, because it's relevant.  The PSR flags which we
can see in the oops dump are nZCv, so Z and C are set.
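
The ARM-mode words in the listing are the same bytes as the Thumb halfwords, just read 32 bits at a time. A minimal sketch of that reinterpretation, assuming a little-endian kernel (the helper name is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Combine two consecutive 16-bit Thumb halfwords into the 32-bit word an
 * ARM-state fetch would see on a little-endian system: the halfword at the
 * lower address lands in the low 16 bits. */
static uint32_t arm_word_from_halfwords(uint16_t at_lower_addr,
                                        uint16_t at_higher_addr)
{
    return ((uint32_t)at_higher_addr << 16) | at_lower_addr;
}

static void check_reinterpretation(void)
{
    /* c000d18c: 46af, c000d18e: 46e9 */
    assert(arm_word_from_halfwords(0x46af, 0x46e9) == 0x46e946afu);
    /* c000d190: ea4f 3959 */
    assert(arm_word_from_halfwords(0xea4f, 0x3959) == 0x3959ea4fu);
    /* c000d198: e7c8, c000d19a: bf00 */
    assert(arm_word_from_halfwords(0xe7c8, 0xbf00) == 0xbf00e7c8u);
}
```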

All the above ARM instructions are not executed, except for two.  c000d1a0,
which has no writeback, and writes below the current stack pointer (and
that data is lost when we take the next exception.)  The other instruction
which is executed is c000d1ac, which takes us to... 0xc03493dc.  However,
remember that bit 1 of the PC got set.  So that makes the PC value
0xc03493de.

And that value is the value we find in the oops dump for PC.  What is the
instruction here when interpreted in ARM mode?

   0:   f71e150c    ;  instruction: 0xf71e150c

and there we have our undefined instruction (remember that the 'never'
condition code, 0xf, has been deprecated and is now always executed as it
is now being used for additional instructions.)

This path also nicely explains the state of the

[Announce] sg3_utils-1.34 available

2012-10-14 Thread Douglas Gilbert


sg3_utils is a package of command line utilities for sending
SCSI and some ATA commands to devices. This package targets
the linux kernel (lk) 3, 2.6 and lk 2.4 series. It also has
ports to FreeBSD, Tru64, Solaris, and Windows (cygwin and mingw).

This version adds sg_xcopy and sg_copy_results (contributed by
Hannes Reinecke). This version tracks various changes made by
www.t10.org since January 2012.

For an overview of sg3_utils and downloads see this page:
http://sg.danny.cz/sg/sg3_utils.html
The sg_ses utility (for enclosure devices) is discussed at:
http://sg.danny.cz/sg/sg_ses.html
The SG_IO ioctl is discussed at:
http://sg.danny.cz/sg/sg_io.html
A full changelog can be found at:
http://sg.danny.cz/sg/p/sg3_utils.ChangeLog

A release announcement will be sent to freecode.com .

Changelog for sg3_utils-1.34 [20121013] [svn: r461]
  - sg_xcopy: new dd like utility for extended copy command
  - sg_copy_results: new utility for receive copy results
  - sg_verify: add 16 byte cdb, bytchk (data-out buffer)
    and group number support
  - sync to spc4r36 and sbc3r32
  - sg_inq: add --export so sg_inq can replace udev's scsi_id
    - decode old EMC Symmetrix abuse of VPD page 0x83
  - sg_vpd: decode old EMC Symmetrix abuse of VPD page 0x83
  - sg_ses: increase max dpage response size to 64 KB
    - allow ident,locate on enclosure controller
    - more sanity for additional element status descriptor
  - sg_sanitize: add --ause, --fail and --test=
  - sg_luns: add long extended flat space addressing format
  - sg_logs: add ATA pass-through results lpage (SAT-2)
  - sg_rtpg: add --extended option
  - sg_senddiag: list rebuild assist diag page name
  - sg_pt_linux: expand DID_ (host_byte) codes
    - cope with a transport error plus sense data
    - prefer major() over MAJOR() macro
  - sg_lib: fix sg_get_command_name() service actions
    - report sdat_ovfl bit (if set) in sense data
    - decode extended_copy and receive_copy service actions
    - decode read_buffer and write_buffer modes
    - decode ATA PT fixed format sense (SAT-2)
  - sg_cmds_extra: add sg_ll_report_tgt_prt_grp2()
  - ./configure options:
    - change --enable-no-linux-bsg to --disable-linuxbsg
    - add --disable-scsistrings to reduce utility sizes

Changelog for sg3_utils-1.33 [20120118] [svn: r435]
...

Doug Gilbert

Re: RCU NOHZ, tsc, and clock_gettime

2012-10-14 Thread Paul E. McKenney

On Fri, Oct 12, 2012 at 02:27:01PM -0400, Prarit Bhargava wrote:
> 
> > The effect of removing the two functions you noted (on 3.6 and earlier)
> > is to prevent RCU from checking for dyntick-idle CPUs, likely incurring
> > a cache miss for each CPU with interrupts disabled.  If you have a lot
> > of CPUs (or even if NR_CPUS is large and you have a smaller number of
> > CPUs), this can result in user-space-visible delays.
> > 
> 
> Paul,
> 
> I built a kernel with NR_CPUS=48 and booted on a 48 cpu (logical) system.  I do
> not see a difference in the test -- the variance is AFAICT just as large as if I
> had run with NR_CPUS=4096.

OK -- have you applied John Stultz's suggestions?

Thanx, Paul


Re: [git pull] signals pile 3

2012-10-14 Thread Daniel Mack

On 15.10.2012 00:24, Russell King - ARM Linux wrote:
> Okay, here's the post-mortem diagnosis.
> 
> What's happening is as follows (I'm very certain of this.)
> 
> We come through the usual init, and issue (see init/main.c):
> 
>   kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);
> 
> This creates a new thread, which falls through to the ret_from_fork
> assembly, with r4 set NULL and r5 set to kernel_init.  You can see
> this in your oops dump register set - r5 is 0xc0344555, which is the
> address of kernel_init plus 1 which marks the function as Thumb code.
> 
> Now, let's look at this code a little closer - this is what the
> disassembly looks like:
> 
> c000d180 :
> c000d180:   f03a fe08   bl      c0047d94
> c000d184:   2d00        cmp     r5, #0
> c000d186:   bf1e        ittt    ne
> c000d188:   4620        movne   r0, r4
> c000d18a:   46fe        movne   lr, pc <-- XXX
> c000d18c:   46af        movne   pc, r5
> c000d18e:   46e9        mov     r9, sp
> c000d190:   ea4f 3959   mov.w   r9, r9, lsr #13
> c000d194:   ea4f 3949   mov.w   r9, r9, lsl #13
> c000d198:   e7c8        b.n     c000d12c
> c000d19a:   bf00        nop
> c000d19c:   f3af 8000   nop.w
> 
> I have marked one instruction, and it's the significant one.
> 
> Eventually, having had a successful call to kernel_execve(), kernel_init()
> returns zero.
> 
> In returning, it uses the value in 'lr' which was set by the instruction
> I marked above.  Unfortunately, this causes lr to contain 0xc000d18e -
> an even address.  This switches the ISA to ARM on return but with a non
> word aligned PC value.
> 
> So, what do we end up executing?  Well, not the instructions above - yes
> the opcodes, but they don't mean the same thing in ARM mode.  In ARM mode,
> it looks like this instead:
> 
> c000d18c:   46e946af    strbtmi r4, [r9], pc, lsr #13
> c000d190:   3959ea4f    ldmdbcc r9, {r0, r1, r2, r3, r6, r9, fp, sp, lr, pc}^
> c000d194:   3949ea4f    stmdbcc r9, {r0, r1, r2, r3, r6, r9, fp, sp, lr, pc}^
> c000d198:   bf00e7c8    svclt   0xe7c8
> c000d19c:   8000f3af    andhi   pc, r0, pc, lsr #7
> c000d1a0:   e88db092    stm     sp, {r1, r4, r7, ip, sp, pc}
> c000d1a4:   46e81fff    ;  instruction: 0x46e81fff
> c000d1a8:   8a00f3ef    bhi     0xc004a16c
> c000d1ac:   0a0cf08a    beq     0xc03493dc
> 
> I have included more above, because it's relevant.  The PSR flags which we
> can see in the oops dump are nZCv, so Z and C are set.
> 
> All the above ARM instructions are not executed, except for two.  c000d1a0,
> which has no writeback, and writes below the current stack pointer (and
> that data is lost when we take the next exception.)  The other instruction
> which is executed is c000d1ac, which takes us to... 0xc03493dc.  However,
> remember that bit 1 of the PC got set.  So that makes it 0xc03493de.
> 
> And that value is the value we find in the oops dump for PC.  What is the
> instruction here when interpreted in ARM mode?
> 
>    0:   f71e150c    ;  instruction: 0xf71e150c
> 
> and there we have our undefined instruction (remember that the 'never'
> condition code, 0xf, has been deprecated and is now always executed.)
> 
> So, what we have above is a consistent and sane story for how we ended up
> at such a strange place in the kernel with such an odd register dump - with
> no unanswered questions about what happened to get us there.
> 
> In light of this, I'm 100% certain that the patch below will fix the issue
> you're seeing - please test this and get back to me ASAP, thanks.

Quite impressive analysis :) And it seems you really spotted the reason
here, as your patch fixes the problem.

>  arch/arm/kernel/entry-common.S |4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
> index 417bac1..3471175 100644
> --- a/arch/arm/kernel/entry-common.S
> +++ b/arch/arm/kernel/entry-common.S
> @@ -88,9 +88,9 @@ ENTRY(ret_from_fork)
>   bl  schedule_tail
>   cmp r5, #0
>   movne   r0, r4
> - movne   lr, pc
> + adrne   lr, BSYM(1f)
>   movne   pc, r5
> - get_thread_info tsk
> +1:   get_thread_info tsk
>   b   ret_slow_syscall
>  ENDPROC(ret_from_fork)

Tested-by: Daniel Mack 

Many thanks for the very prompt response!


Daniel



Linux 3.7-rc1

2012-10-14 Thread Linus Torvalds

The two weeks are up, and I was merging during my trip, so no reason
for merge window extensions.

The 3.7-rc1 kernel is out there. There's a few big things worth noting here:

 - the "uapi" include file cleanups. The idea is that the stuff
exported to user space should now be found under include/uapi and
arch/$(ARCH)/include/uapi.

   Let's hope it actually works. Because otherwise this was just a
totally pointless pain in the *ss. And regardless, I'm definitely done
with these kinds of "let's do massive cleanup of the include files"
forever.

 - arm64 architecture inclusion. Let's see if it takes off..

   ... and let's see how many years we'll need before the arm people
do what every single other 64-bit arch has ever done: merge back with
the 32-bit code. As usual, people claimed that there were tons of
reasons why *this* time was different, and as usual it's almost
certainly going to be BS in the end, and a few years from now we'll
have big patches trying to merge it all back. But maybe it really
*was* different this time. Snicker.

 - arm multiplatform code.

   Finally. The ARM devicetree code stuff etc means that at least some
arm kernels can now be built to support multiple different platforms
in one single binary. I'm sure there's still tons to go, but it's a
big milestone nonetheless.

 - ARM virtualization and Xen support.

 - user namespaces are coming back in a workable form.

 - signed kernel modules

 - nice cleanups: workqueues (Tejun Heo) and generic
execve/kernel_thread (Al Viro)

There are tons of other updates, but those are the "big new features"
that came to mind. Maybe I missed some.

Of course, despite all the above changes, the bulk of the actual
patches are still the usual driver updates, which aren't even
mentioned above. So the "big changes" are actually in reality smaller
than the "normal changes we have all the time".

Anyway, the shortlog is much too big as usual for an -rc1 (with over
ten thousand commits), but appended is my "short mergelog" that gives
at least some high-level view of the merges I did.

   Linus

---
Mergelog since 3.6:
  - GFS2 updates from Steven Whitehouse
  - regmap updates from Mark Brown
  - regulator updates from Mark Brown
  - the trivial tree from Jiri Kosina
  - HID updates from Jiri Kosina
  - localmodconfig fixes from Steven Rostedt
  - ktest fix from Steven Rostedt
  - RCU changes from Ingo Molnar
  - core kernel fixes from Ingo Molnar
  - core locking changes from Ingo Molnar
  - trivial irq core update from Ingo Molnar
  - perf update from Ingo Molnar
  - perf fix from Ingo Molnar
  - scheduler changes from Ingo Molnar
  - timer changes from Ingo Molnar
  - x86/apic changes from Ingo Molnar
  - x86/asm changes from Ingo Molnar
  - x86/build changes from Ingo Molnar
  - x86/cleanups from Ingo Molnar
  - x86/cpu and x86/cpufeature from Ingo Molnar
  - x86 debug update from Ingo Molnar
  - x86/EFI changes from Ingo Molnar
  - x86/fpu update from Ingo Molnar
  - x86/MCE update from Ingo Molnar
  - x86/mm changes from Ingo Molnar
  - x86/platform changes from Ingo Molnar
  - x86/microcode changes from Ingo Molnar
  - s390 updates from Martin Schwidefsky
  - arm64 support from Catalin Marinas
  - PCI changes from Bjorn Helgaas
  - clk framework update from Michael Turquette
  - char/misc driver merge from Greg Kroah-Hartman
  - driver core merge from Greg Kroah-Hartman
  - staging tree update from Greg Kroah-Hartman
  - TTY changes from Greg Kroah-Hartman
  - USB changes from Greg Kroah-Hartman
  - hwmon updates from Guenter Roeck
  - dlm updates from David Teigland
  - m68k updates from Geert Uytterhoeven
  - ia64 update from Tony Luck
  - x86/smap support from Ingo Molnar
  - CIFS updates from Steve French
  - non-critical ARM soc bug fixes from Olof Johansson
  - ARM soc general cleanups from Olof Johansson
  - ARM soc MAINTAINERS updates from Olof Johansson
  - ARM soc-specific updates from Olof Johansson
  - ARM soc device tree updates from Olof Johansson
  - ARM soc cleanups, part 2 from Olof Johansson
  - ARM soc-specific updates, take 2 from Olof Johansson
  - ARM soc driver specific changes from Olof Johansson
  - ARM soc board specific updates from Olof Johansson
  - ARM soc device tree updates, take 2 from Olof Johansson
  - ARM soc documentation updates from Olof Johansson
  - ARM soc multiplatform enablement from Olof Johansson
  - workqueue changes from Tejun Heo
  - cgroup updates from Tejun Heo
  - cgroup hierarchy update from Tejun Heo
  - user namespace changes from Eric Biederman
  - sparc updates from David Miller
  - networking changes from David Miller
  - GPIO changes from Linus Walleij
  - pinctrl changes from Linus Walleij
  - input updates from Dmitry Torokhov
  - infiniband updates from Roland Dreier
  - spi updates from Mark Brown
  - libata changes from Jeff Garzik
  - power management updates from Rafael J Wysocki
  - first round of SCSI updates from James Bottomley
  - CMA and DMA-mapping updat

Re: [git pull] signals pile 3

2012-10-14 Thread Russell King - ARM Linux

Okay, here's the post-mortem diagnosis.

What's happening is as follows (I'm very certain of this.)

We come through the usual init, and issue (see init/main.c):

kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);

This creates a new thread, which falls through to the ret_from_fork
assembly, with r4 set NULL and r5 set to kernel_init.  You can see
this in your oops dump register set - r5 is 0xc0344555, which is the
address of kernel_init plus 1 which marks the function as Thumb code.

Now, let's look at this code a little closer - this is what the
disassembly looks like:

c000d180 :
c000d180:   f03a fe08   bl      c0047d94
c000d184:   2d00        cmp     r5, #0
c000d186:   bf1e        ittt    ne
c000d188:   4620        movne   r0, r4
c000d18a:   46fe        movne   lr, pc <-- XXX
c000d18c:   46af        movne   pc, r5
c000d18e:   46e9        mov     r9, sp
c000d190:   ea4f 3959   mov.w   r9, r9, lsr #13
c000d194:   ea4f 3949   mov.w   r9, r9, lsl #13
c000d198:   e7c8        b.n     c000d12c
c000d19a:   bf00        nop
c000d19c:   f3af 8000   nop.w

I have marked one instruction, and it's the significant one.

Eventually, having had a successful call to kernel_execve(), kernel_init()
returns zero.

In returning, it uses the value in 'lr' which was set by the instruction
I marked above.  Unfortunately, this causes lr to contain 0xc000d18e -
an even address.  This switches the ISA to ARM on return but with a non
word aligned PC value.

So, what do we end up executing?  Well, not the instructions above - yes
the opcodes, but they don't mean the same thing in ARM mode.  In ARM mode,
it looks like this instead:

c000d18c:   46e946af    strbtmi r4, [r9], pc, lsr #13
c000d190:   3959ea4f    ldmdbcc r9, {r0, r1, r2, r3, r6, r9, fp, sp, lr, pc}^
c000d194:   3949ea4f    stmdbcc r9, {r0, r1, r2, r3, r6, r9, fp, sp, lr, pc}^
c000d198:   bf00e7c8    svclt   0xe7c8
c000d19c:   8000f3af    andhi   pc, r0, pc, lsr #7
c000d1a0:   e88db092    stm     sp, {r1, r4, r7, ip, sp, pc}
c000d1a4:   46e81fff    ;  instruction: 0x46e81fff
c000d1a8:   8a00f3ef    bhi     0xc004a16c
c000d1ac:   0a0cf08a    beq     0xc03493dc

I have included more above, because it's relevant.  The PSR flags which we
can see in the oops dump are nZCv, so Z and C are set.
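
Which of those words actually execute follows from the condition field in bits 31:28 of each opcode. The sketch below evaluates that field against the nZCv flags; it follows the standard ARM condition table, while the struct and function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct flags { bool n, z, c, v; };

/* Evaluate the condition field (bits 31:28) of an ARM-state instruction. */
static bool cond_passes(uint32_t insn, struct flags f)
{
    uint32_t cond = insn >> 28;
    bool r;

    switch (cond >> 1) {
    case 0: r = f.z; break;                 /* eq / ne */
    case 1: r = f.c; break;                 /* cs / cc */
    case 2: r = f.n; break;                 /* mi / pl */
    case 3: r = f.v; break;                 /* vs / vc */
    case 4: r = f.c && !f.z; break;         /* hi / ls */
    case 5: r = f.n == f.v; break;          /* ge / lt */
    case 6: r = f.n == f.v && !f.z; break;  /* gt / le */
    default: return true;                   /* al, and the 0xf encodings */
    }
    return (cond & 1u) ? !r : r;
}

/* Count how many of 'n' opcodes would execute with flags 'f'. */
static int count_executed(const uint32_t *insns, int n, struct flags f)
{
    int executed = 0;

    assert(insns != 0 && n >= 0);
    for (int i = 0; i < n; i++)
        executed += cond_passes(insns[i], f) ? 1 : 0;
    return executed;
}
```

Fed the nine opcodes from the listing with Z and C set, only the `stm` (condition 0xe) and the `beq` (condition 0x0) pass.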

All the above ARM instructions are not executed, except for two.  c000d1a0,
which has no writeback, and writes below the current stack pointer (and
that data is lost when we take the next exception.)  The other instruction
which is executed is c000d1ac, which takes us to... 0xc03493dc.  However,
remember that bit 1 of the PC got set.  So that makes it 0xc03493de.
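
The branch target can be reproduced from the opcode. This sketch decodes the 24-bit immediate of an ARM-state branch in the standard way (shift left by two, sign-extend, add to the instruction address plus 8); the function name is hypothetical, and carrying bit 1 of the misaligned PC into the result simply mirrors the account given above.

```c
#include <assert.h>
#include <stdint.h>

/* Target of an ARM-state B<cond> located at 'insn_addr'. */
static uint32_t arm_branch_target(uint32_t insn_addr, uint32_t insn)
{
    uint32_t imm24 = insn & 0x00ffffffu;
    uint32_t offset = imm24 << 2;

    if (imm24 & 0x00800000u)
        offset |= 0xfc000000u;  /* sign-extend the 26-bit offset */
    return insn_addr + 8u + offset;
}

static void check_beq_target(void)
{
    /* c000d1ac: 0a0cf08a  beq 0xc03493dc */
    uint32_t target = arm_branch_target(0xc000d1acu, 0x0a0cf08au);

    assert(target == 0xc03493dcu);
    /* Bit 1 of the return address 0xc000d18e carried over, as described. */
    assert((target | (0xc000d18eu & 2u)) == 0xc03493deu);
}
```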

And that value is the value we find in the oops dump for PC.  What is the
instruction here when interpreted in ARM mode?

   0:   f71e150c    ;  instruction: 0xf71e150c

and there we have our undefined instruction (remember that the 'never'
condition code, 0xf, has been deprecated and is now always executed.)

So, what we have above is a consistent and sane story for how we ended up
at such a strange place in the kernel with such an odd register dump - with
no unanswered questions about what happened to get us there.

In light of this, I'm 100% certain that the patch below will fix the issue
you're seeing - please test this and get back to me ASAP, thanks.

 arch/arm/kernel/entry-common.S |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 417bac1..3471175 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -88,9 +88,9 @@ ENTRY(ret_from_fork)
        bl      schedule_tail
        cmp     r5, #0
        movne   r0, r4
-       movne   lr, pc
+       adrne   lr, BSYM(1f)
        movne   pc, r5
-       get_thread_info tsk
+1:     get_thread_info tsk
        b       ret_slow_syscall
 ENDPROC(ret_from_fork)
 


Re: [PULL] modules

2012-10-14 Thread Alan Cox

> I realize that fips_enabled is only for crazy people, but it's exactly
> code like this that limits it to only crazy people. Is there some
> *reason* for this?

Presumably it's so a typical server with reboot on panic will reboot so
the attacker can hide the attempt better ;-)

Alan


Re: Local DoS through write heavy I/O on CFQ & Deadline

2012-10-14 Thread Dave Chinner

On Thu, Oct 11, 2012 at 01:23:32PM +0100, Alex Bligh wrote:
> We have noticed significant I/O scheduling issues on both the CFQ and the
> deadline scheduler where a non-root user can starve any other process of
> any I/O for minutes at a time. The problem is more serious using CFQ but is
> still an effective local DoS vector using Deadline.
> 
> A simple way to generate the problem is:
> 
>   dd if=/dev/zero of=- bs=1M count=5 | dd if=- of=myfile bs=1M count=5
> 
> (note use of 2 dd's is to avoid alleged optimisation of the writing dd
> from /dev/zero). zcat-ing a large file with stdout redirected to a file
> produces a similar error. Using ionice to set idle priority makes no
> difference.
> 
> To instrument the problem we produced a python script which does a MySQL
> select and update every 10 seconds, and time the execution of the update.
> This is normally milliseconds, but under user generated load conditions, we
> can take this to indefinite (on CFQ) and over a minute (on deadline).
> Postgres is affected in a similar manner (i.e. it is not MySQL specific).
> Simultaneously we have captured the output of 'vmstat 1 2' and
> /proc/meminfo, with appropriate timestamps.

Well, mysql is stuck in fsync(), so of course it's going to have
problems with write latency:

[ 3840.268303] [] jbd2_log_wait_commit+0xb5/0x130
[ 3840.268308] [] ? add_wait_queue+0x60/0x60
[ 3840.268313] [] ext4_sync_file+0x208/0x2d0

And postgres gets stuck there too. So what you are seeing is likely
an ext4 problem, not an IO scheduler problem.

Suggestion: try the same test with XFS. If the problem still exists,
then it *might* be an ioscheduler problem. If it goes away, then
it's an ext4 problem.

Cheers,

Dave.

-- 
Dave Chinner
da...@fromorbit.com
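
For comparing filesystems as suggested, the fsync() latency can be measured directly rather than through a database. The following is a minimal user-space sketch, assuming a POSIX system; `time_fsync()` and the paths passed to it are made up for illustration and are not taken from the thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Write 'len' zero bytes to 'path', then return the number of seconds spent
 * in fsync() alone, or -1.0 if any step fails. */
static double time_fsync(const char *path, size_t len)
{
    struct timespec t0, t1;
    double secs = -1.0;
    char *buf;
    int fd;

    assert(path != NULL);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1.0;

    buf = calloc(1, len ? len : 1);
    if (buf && write(fd, buf, len) == (ssize_t)len) {
        clock_gettime(CLOCK_MONOTONIC, &t0);
        if (fsync(fd) == 0) {
            clock_gettime(CLOCK_MONOTONIC, &t1);
            secs = (double)(t1.tv_sec - t0.tv_sec) +
                   (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
        }
    }
    free(buf);
    close(fd);
    return secs;
}
```

Calling this in a loop on a file on the filesystem under test, while the dd load runs in the background, gives a per-call latency that can be compared between ext4 and XFS.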

Re: [PULL] modules

2012-10-14 Thread Linus Torvalds

On Sun, Oct 14, 2012 at 1:11 PM, Linus Torvalds
 wrote:
>
> I've pulled and resolved the branch, and I'm going through it now, but
> I'd like this verified before I push out if it all looks fine..

Hmm. So this thing makes me wonder:

    /* Not having a signature is only an error if we're strict. */
    if (err < 0 && fips_enabled)
        panic("Module verification failed with error %d in FIPS mode\n",
              err);

do we really want to panic (even in fips_enabled mode)?

Sounds like it will just kill the machine if we ever end up having an
unsigned module by mistake anywhere.

I realize that fips_enabled is only for crazy people, but it's exactly
code like this that limits it to only crazy people. Is there some
*reason* for this?

 Linus

Re: Intel graphics drm issue?

2012-10-14 Thread Bruno Prémont

On Sun, 14 October 2012 Mark Hounschell  wrote:
> I gave it a try. I don't think it liked my kernel cmdline. dmesg attached. 
> There is a lot more in there now that nomodeset is gone and the debug is 
> turned on.
> 
> # ls -al /lib/firmware/edid/lg42lb9df.edid
> -rw-r--r-- 1 root root 1024 Oct 14  2012 /lib/firmware/edid/lg42lb9df.edid
> 
> ## cat /proc/cmdline
> root=/dev/disk/by-id/ata-INTEL_SSDSC2CW060A3_CVCV205106EB060AGN-part4 
> noresume splash=silent quiet apm=off vga=normal drm.debug=0xe irqpoll 
> drm_kms_helper.edid_firmware=edid/lg42lb9df.edid
> 
> 
> from attached dmesg:
> [1.833032] drm_kms_helper: Unknown parameter `edid'

As your drm drivers seem to all be built-in (according to kernel timings)
you will have to build the EDID firmware into the kernel as well (see
CONFIG_EXTRA_FIRMWARE), otherwise it probably can't be loaded (unless Linus'
firmware loading patch is already in 3.6.2 and root filesystem/initrd is
ready at that time).

Did you set CONFIG_DRM_LOAD_EDID_FIRMWARE?
If not, that may be the reason for unknown parameter `edid' error.

But I see I mis-remembered the size of EDID blobs: they are just 128 bytes
per block, not 512 (seems I was thinking of disk sector sizes),
so you should get just 256 bytes of output.

Just truncating the file to 256 bytes will do. You may also
change your .c file to have
  uint8_t firmware[] = {
 ...
  };
  ...
  fwrite(firmware, sizeof(firmware), 1, fd);
  ...

that way compiler gets numbers right :)

Kernel code rejects edid with unexpected size! Thus it would have
complained if it had tried to load it with a size of 1k.

Bruno
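
A blob can be sanity-checked before handing it to the kernel. The sketch below is not the kernel's loader code; it is a hypothetical user-space check built on the documented EDID layout: a fixed 8-byte header, 128-byte blocks, an extension count in byte 126, and every block summing to zero modulo 256.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EDID_BLOCK_SIZE 128

/* Return true if 'blob' looks like a well-formed EDID of exactly 'len' bytes. */
static bool edid_blob_ok(const uint8_t *blob, size_t len)
{
    static const uint8_t header[8] = {
        0x00, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00
    };

    assert(blob != NULL);
    if (len < EDID_BLOCK_SIZE || len % EDID_BLOCK_SIZE != 0)
        return false;
    if (memcmp(blob, header, sizeof(header)) != 0)
        return false;
    /* Byte 126 of the base block holds the number of extension blocks. */
    if (len != ((size_t)blob[126] + 1) * EDID_BLOCK_SIZE)
        return false;
    /* Each 128-byte block must sum to 0 modulo 256. */
    for (size_t off = 0; off < len; off += EDID_BLOCK_SIZE) {
        unsigned int sum = 0;

        for (size_t i = 0; i < EDID_BLOCK_SIZE; i++)
            sum += blob[off + i];
        if ((sum & 0xffu) != 0)
            return false;
    }
    return true;
}
```

A base block with one extension passes at 256 bytes and fails at 1024, which matches the size mismatch discussed above.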

Re: [git pull] signals pile 3

2012-10-14 Thread Russell King - ARM Linux

On Sun, Oct 14, 2012 at 05:35:23PM +0200, Daniel Mack wrote:
> I rebased my ARM development branch and figured that your patch 9fff2fa
> ("arm: switch to saner kernel_execve() semantics") breaks the boot on my
> board right after init is invoked via NFS:

Ok, I'm not going to assign blame to Al's commits (I never reviewed his
stuff before they hit mainline - patches never posted to the ARM mailing
list, and the development actually happened within the merge window,
all things we tell people not to do...)  I _still_ haven't reviewed that
stuff yet.

But... nevertheless...

> [4.682072] VFS: Mounted root (nfs filesystem) on device 0:12.
> [4.690744] devtmpfs: mounted
> [4.694395] Freeing init memory: 172K
> [5.291417] Internal error: Oops - undefined instruction: 0 [#1] SMP THUMB2

Ok, so this tells us the kernel was built using Thumb2 ISA.

> [5.298734] Modules linked in:
> [5.301952] CPU: 0    Not tainted  (3.6.0-11053-g56c8535 #128)
> [5.308071] PC is at cpsw_probe+0x422/0x9ac

PC is not word aligned, so it can't be running in the ARM ISA.

> [5.312459] LR is at trace_hardirqs_on_caller+0x8f/0xfc
> [5.317934] pc : []    lr : []    psr: 6113

Note that this reconfirms the above (well, it should do, it's the same
value.)

> [5.317934] sp : cf055fb0  ip :   fp : 
> [5.329944] r10:   r9 :   r8 : 
> [5.335413] r7 :   r6 :   r5 : c034458d  r4 : 
> [5.342244] r3 : cf057a40  r2 :   r1 : 0001  r0 : 
> [5.349078] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user

And this tells us that we're running in ARM mode, not Thumb mode.
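
The Flags line is a decoded view of the psr value. As a sketch of that decoding, using the standard CPSR bit positions: the archived dump shows only "6113", so the full value 0x60000113 used here is an assumption chosen to match the Flags line, and the struct and function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct psr_fields {
    bool n, z, c, v;   /* condition flags, bits 31..28 */
    bool irqs_on;      /* I bit (7) clear */
    bool fiqs_on;      /* F bit (6) clear */
    bool thumb;        /* T bit (5) set */
    unsigned mode;     /* mode field, bits 4..0 */
};

static struct psr_fields decode_psr(uint32_t psr)
{
    struct psr_fields f;

    f.n = (psr >> 31) & 1u;
    f.z = (psr >> 30) & 1u;
    f.c = (psr >> 29) & 1u;
    f.v = (psr >> 28) & 1u;
    f.irqs_on = !((psr >> 7) & 1u);
    f.fiqs_on = !((psr >> 6) & 1u);
    f.thumb = (psr >> 5) & 1u;
    f.mode = psr & 0x1fu;
    return f;
}

static void check_assumed_psr(void)
{
    /* ASSUMPTION: 0x60000113 is a reconstruction, not a value from the dump. */
    struct psr_fields f = decode_psr(0x60000113u);

    assert(!f.n && f.z && f.c && !f.v);  /* nZCv */
    assert(f.irqs_on && f.fiqs_on);      /* IRQs on, FIQs on */
    assert(!f.thumb);                    /* ISA ARM */
    assert(f.mode == 0x13u);             /* SVC_32 */
}
```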

> [5.356546] Control: 50c5387d  Table: 8f434019  DAC: 0015
> [5.362562] Process init (pid: 1, stack limit = 0xcf054240)
> [5.368395] Stack: (0xcf055fb0 to 0xcf056000)
> [5.372961] 5fa0: 0001
> [5.381525] 5fc0: cf055fb0 c000d1a8
> [5.390091] 5fe0:  bee83f10  b6fdedd0 0010  bfaf a8babbaa

No stack backtrace (and it's silent about why that is).

The other strange thing here is that the stack dump above is showing that
the stack is completely empty - which shouldn't be the case if we're in a
driver probe function - driver probe functions are called via the driver
model layers...

> [5.398664] Code: 2206a010 718ef508 0184f8da f8b1f65d (3070f8d8)

And now we come to the Code: line, which makes no sense as an ARM ISA:

   0:   2206a010    andcs   sl, r6, #16
   4:   718ef508    orrvc   pc, lr, r8, lsl #10
   8:   0184f8da    ldrdeq  pc, [r4, sl]
   c:   f8b1f65d    ;  instruction: 0xf8b1f65d
  10:   3070f8d8    ldrsbtcc pc, [r0], #-136 ; 0xff78;

But as Thumb, it looks more reasonable:

   0:   a010        add     r0, pc, #64 ; (adr r0, 44 )
   2:   2206        movs    r2, #6
   4:   f508 718e   add.w   r1, r8, #284 ; 0x11c
   8:   f8da 0184   ldr.w   r0, [sl, #388] ; 0x184
   c:   f65d f8b1   bl      ffe5d172
  10:   f8d8 3070   ldr.w   r3, [r8, #112] ; 0x70

I don't have any further comments to make on this yet, as I've no idea
what state stuff is in, but the above oops dump to me suggests that
we've randomly jumped into some part of the kernel which just happens
to be cpsw_probe().

Please send me (in private mail) your vmlinux file and a corresponding
oops dump from that same kernel, and I'll dig and try and work out
what's going on...

This kind of investigation reminds me of those I did back in the 1990s
when stuff was rather unstable and ARM was a young architecture.  Now
all we need is for an ARM platform to dump its entire memory out the
ethernet port, bringing a university department network to a halt (I
did that once - back in the 1990s - sorry Tim!)

Re: [PULL] modules

2012-10-14 Thread Linus Torvalds

On Wed, Oct 10, 2012 at 2:57 AM, Rusty Russell  wrote:
>
> 
> module signing is the highlight, but it's an all-over David Howells frenzy...
>
> 

Hmm. What happened here? It *looks* from your pull request like you
had a tag, and you usually do, but there's no tag anywhere..

I've pulled and resolved the branch, and I'm going through it now, but
I'd like this verified before I push out if it all looks fine..

  Linus

Re: [PATCH REGRESSION FIX] dw_dmac: make driver's endianness configurable

2012-10-14 Thread Andy Shevchenko

On Sun, Oct 14, 2012 at 10:54 AM, Hein Tibosch  wrote:
> From: Hein Tibosch 
>
> The dw_dmac was originally developed for avr32 to be used with the Synopsys
> DesignWare AHB DMA controller. Starting from 2.6.38, access to the device's i/o
> memory was done with the little-endian readl/writel functions(1)
>
> This broke the driver for the avr32 platform, because it needs big (native)
> endian accessors.
> This patch makes the endianness configurable using 'DW_DMAC_BIG_ENDIAN_IO',
> which will default to true for AVR32
>
> I submitted this patch before(2) but then waited for Andy to finish other
> changes to the same module(3).
>
> (1) https://patchwork.kernel.org/patch/608211
> (2) https://lkml.org/lkml/2012/8/26/148
> (3) https://lkml.org/lkml/2012/9/21/173
>
> Signed-off-by: Hein Tibosch 
>
> ---
>  drivers/dma/Kconfig|   11 +++
>  drivers/dma/dw_dmac_regs.h |   18 +-
>  2 files changed, 24 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
> index 677cd6e..d4c1218 100644
> --- a/drivers/dma/Kconfig
> +++ b/drivers/dma/Kconfig
> @@ -90,6 +90,17 @@ config DW_DMAC
>   Support the Synopsys DesignWare AHB DMA controller.  This
>   can be integrated in chips such as the Atmel AT32ap7000.
>
> +config DW_DMAC_BIG_ENDIAN_IO
> +   bool "Use big endian I/O register access"
> +   default y if AVR32
> +   depends on DW_DMAC
> +   help
> + Say yes here to use big endian I/O access when reading and writing
> + to the DMA controller registers. This is needed on some platforms,
> + like the Atmel AVR32 architecture.
> +
> + If unsure, use the default setting.
> +
>  config AT_HDMAC
> tristate "Atmel AHB DMA support"
> depends on ARCH_AT91
> diff --git a/drivers/dma/dw_dmac_regs.h b/drivers/dma/dw_dmac_regs.h
> index ff39fa6..8896559 100644
> --- a/drivers/dma/dw_dmac_regs.h
> +++ b/drivers/dma/dw_dmac_regs.h
> @@ -98,9 +98,17 @@ struct dw_dma_regs {
> u32 DW_PARAMS;
>  };
>
> +#ifdef CONFIG_DW_DMAC_BIG_ENDIAN_IO
> +#define dma_readl_native ioread32be
> +#define dma_writel_native iowrite32be
> +#else
> +#define dma_readl_native readl
> +#define dma_writel_native writel
> +#endif
> +
>  /* To access the registers in early stage of probe */
>  #define dma_read_byaddr(addr, name) \
> -   readl((addr) + offsetof(struct dw_dma_regs, name))
> +   dma_readl_native((addr) + offsetof(struct dw_dma_regs, name))
>
>  /* Bitfields in DW_PARAMS */
>  #define DW_PARAMS_NR_CHAN  8   /* number of channels */
> @@ -216,9 +224,9 @@ __dwc_regs(struct dw_dma_chan *dwc)
>  }
>
>  #define channel_readl(dwc, name) \
> -   readl(&(__dwc_regs(dwc)->name))
> +   dma_readl_native(&(__dwc_regs(dwc)->name))
>  #define channel_writel(dwc, name, val) \
> -   writel((val), &(__dwc_regs(dwc)->name))
> +   dma_writel_native((val), &(__dwc_regs(dwc)->name))
>
>  static inline struct dw_dma_chan *to_dw_dma_chan(struct dma_chan *chan)
>  {
> @@ -246,9 +254,9 @@ static inline struct dw_dma_regs __iomem *__dw_regs(struct dw_dma *dw)
>  }
>
>  #define dma_readl(dw, name) \
> -   readl(&(__dw_regs(dw)->name))
> +   dma_readl_native(&(__dw_regs(dw)->name))
>  #define dma_writel(dw, name, val) \
> -   writel((val), &(__dw_regs(dw)->name))
> +   dma_writel_native((val), &(__dw_regs(dw)->name))
>
>  #define channel_set_bit(dw, reg, mask) \
> dma_writel(dw, reg, ((mask) << 8) | (mask))
Why did you not change this one?


-- 
With Best Regards,
Andy Shevchenko
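
For readers following the endianness discussion in the quoted patch, here
is a rough userspace sketch of the difference between the two accessor
families.  It only models how four register bytes are assembled into a
value.  The demo_* names and the DEMO_BIG_ENDIAN_IO switch are invented
for illustration and stand in for the Kconfig-selected
dma_readl_native; nothing here touches real MMIO.

```c
#include <stdint.h>

/* Assemble a 32-bit value from four bytes, least significant byte first. */
uint32_t demo_load32_le(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Assemble a 32-bit value from four bytes, most significant byte first. */
uint32_t demo_load32_be(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Compile-time selection, in the same spirit as the #ifdef in the patch. */
#ifdef DEMO_BIG_ENDIAN_IO
#define demo_readl_native demo_load32_be
#else
#define demo_readl_native demo_load32_le
#endif
```

The same four bytes 12 34 56 78 read back as 0x78563412 through the
little-endian helper and as 0x12345678 through the big-endian one, which
is the kind of mismatch the patch description attributes to readl/writel
on avr32.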

Re: [PULL REQ] IXP4xx changes for Linux 3.7

2012-10-14 Thread Arnd Bergmann

On Saturday 13 October 2012, Krzysztof Halasa wrote:
> Linus,
> 
> please pull my ARM IXP4xx changes for 3.7:
> 
> The following changes since commit 4d7127dace8cf4b05eb7c8c8531fc204fbb195f4:
> 
> "Merge branch 'for-linus' of git://git.kernel.org/.../jmorris/linux-security"
> (2012-10-13 11:29:00 +0900)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/chris/linux.git next
> 
> for you to fetch changes up to b94740b3b38fd8e37fcd3bb06a18ec2796061c7d:
>   IXP4xx: use __iomem for MMIO (2012-10-13 20:37:30 +0200)
> 
> Build-tested for now. This is based on your current tree tip because it
> depends on commits following 3.6 release.

Hi Krzysztof,

as mentioned before, all arch/arm/mach-* patches should go through the
arm-soc tree or get an Ack from the arm-soc maintainers. The same thing
is true for the char-misc and the crypto trees.

Also, never rebase your tree immediately before sending a pull request.
The preferred way is to have everything based on the -rc release that
is the latest one at the time when you do your testing. If you rebase
later, you essentially have to test everything again.

Finally when sending bug fixes, please annotate any patches with
'Cc: sta...@vger.kernel.org' if they address bugs that are already
present in older kernels, so that the stable and longterm maintainers
can easily backport the fixes.

Almost all of the platform patches in your tree seem to be bug fixes,
so they are still good for inclusion in v3.7 if you submit them to
arm-soc soon, but please make sure you separate bug fixes from other
changes so we can group them appropriately when forwarding them to
Linus.

Thanks,

Arnd

Re: [revert request for commit 9fff2fa] Re: [git pull] signals pile 3

2012-10-14 Thread Al Viro

On Sun, Oct 14, 2012 at 08:24:03PM +0100, Al Viro wrote:

> Russell, could you recall what those had been about?  I'm not sure if that
> had been oopsable that far back (again, oops scenario is userland stack
> page getting swapped out before we get to start_thread(), leading to
> direct read from an absent page in start_thread() by plain ldr, without
> anything in exception table about that insn), but it looks very odd
> regardless of that problem.

BTW, arm64 has copied that logic, so it also seems to be unsafe and very
odd - there we definitely have only ELF to cope with.  arm64 folks Cc'd...

[PATCH] staging: lirc_serial: silence GCC warning

2012-10-14 Thread Paul Bolle

Building lirc_serial.o triggers this GCC warning:
drivers/staging/media/lirc/lirc_serial.c: In function '__check_sense':
drivers/staging/media/lirc/lirc_serial.c:1301:1: warning: return from incompatible pointer type [enabled by default]

This can be trivially fixed by changing the 'sense' parameter from bool
to int. But, to be safe, we also need to make sure 'sense' will only be
-1, 0, or 1. There's no need to document the new values that are now
allowed for the 'sense' parameter, since they're basically useless.

Signed-off-by: Paul Bolle 
---
0) This warning popped up when building v3.6.2 using Fedora 17's default
config (in which, for some reason, the LIRC drivers were enabled going
from v3.5.y to v3.6.y).

1) Compile tested only.

 drivers/staging/media/lirc/lirc_serial.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/media/lirc/lirc_serial.c b/drivers/staging/media/lirc/lirc_serial.c
index 97ef670..08cfaf6 100644
--- a/drivers/staging/media/lirc/lirc_serial.c
+++ b/drivers/staging/media/lirc/lirc_serial.c
@@ -1239,6 +1239,10 @@ static int __init lirc_serial_init_module(void)
}
}
 
+   /* make sure sense is either -1, 0, or 1 */
+   if (sense != -1)
+   sense = !!sense;
+
result = lirc_serial_init();
if (result)
return result;
@@ -1298,7 +1302,7 @@ MODULE_PARM_DESC(irq, "Interrupt (4 or 3)");
 module_param(share_irq, bool, S_IRUGO);
 MODULE_PARM_DESC(share_irq, "Share interrupts (0 = off, 1 = on)");
 
-module_param(sense, bool, S_IRUGO);
+module_param(sense, int, S_IRUGO);
 MODULE_PARM_DESC(sense, "Override autodetection of IR receiver circuit"
 " (0 = active high, 1 = active low )");
 
-- 
1.7.11.7
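
As a standalone illustration of the clamping added in
lirc_serial_init_module(), the check can be pulled out into a tiny
function.  The name demo_normalize_sense is made up, and reading -1 as
"leave autodetection alone" is an assumption based on the parameter
description, not something the patch states.

```c
/*
 * Clamp a user-supplied value to -1, 0 or 1.
 * -1 is passed through untouched (assumed here to mean "autodetect");
 * every other non-zero value collapses to 1.
 */
int demo_normalize_sense(int sense)
{
	if (sense != -1)
		sense = !!sense;
	return sense;
}
```

So sense=5 or sense=-7 on the module command line would end up as 1,
while -1, 0 and 1 are unchanged.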


[PATCH] checkpatch: Improve network block comment style checking

2012-10-14 Thread Joe Perches

Some comment styles in net and drivers/net are flagged inappropriately.

Avoid proclaiming inline comments like:
int a = b;  /* some comment */
and block comments like:
/*
 * some comment
 ***/
are defective.

Tested with
$ cat drivers/net/t.c
/* foo */

/*
 * foo
 */

/* foo
 */

/* foo
 * bar */

/***
 * some long block comment
 ***/

struct foo {
int bar;/* another test */
};
$

Reported-by: Larry Finger 
Signed-off-by: Joe Perches 
---
 scripts/checkpatch.pl |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 21a9f5d..f18750e 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1890,8 +1890,10 @@ sub process {
}
 
if ($realfile =~ m@^(drivers/net/|net/)@ &&
-   $rawline !~ m@^\+[ \t]*(\/\*|\*\/)@ &&
-   $rawline =~ m@^\+[ \t]*.+\*\/[ \t]*$@) {
+   $rawline !~ m@^\+[ \t]*\*/[ \t]*$@ &&   #trailing */
+   $rawline !~ m@^\+.*/\*.*\*/[ \t]*$@ &&  #inline /*...*/
+   $rawline !~ m@^\+.*\*{2,}/[ \t]*$@ &&   #trailing **/
+   $rawline =~ m@^\+[ \t]*.+\*\/[ \t]*$@) {#non blank */
WARN("NETWORKING_BLOCK_COMMENT_STYLE",
 "networking block comments put the trailing */ on a separate line\n" . $herecurr);
}
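
To try the four patterns outside of checkpatch, here is a small sketch
that ports them to POSIX extended regular expressions.  It is only an
approximation of the Perl: it checks a single added line, skips the
drivers/net and net/ path test, and the demo_* names are invented for
this example.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if 'line' matches the POSIX extended regex, 0 otherwise. */
int demo_match(const char *pattern, const char *line)
{
	regex_t re;
	int hit;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return 0;
	hit = (regexec(&re, line, 0, NULL, 0) == 0);
	regfree(&re);
	return hit;
}

/*
 * Return 1 if an added ("+") line would draw the networking block
 * comment warning: it ends a comment after other text, and is not a
 * lone terminator, an inline comment, or a run-of-stars terminator.
 */
int demo_bad_trailer(const char *line)
{
	return !demo_match("^\\+[ \t]*\\*/[ \t]*$", line) &&	/* lone terminator */
	       !demo_match("^\\+.*/\\*.*\\*/[ \t]*$", line) &&	/* inline comment */
	       !demo_match("^\\+.*\\*{2,}/[ \t]*$", line) &&	/* run of stars */
	       demo_match("^\\+[ \t]*.+\\*/[ \t]*$", line);	/* text, then terminator */
}
```

With these, "+ * bar */" is still flagged, while "+/* foo */" and a
struct member followed by an inline comment are not.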



[PATCH 2/9] uprobes: check for single step support

2012-10-14 Thread Rabin Vincent

Check for single step support before calling user_enable_single_step(),
since user_enable_single_step() just BUG()s if support does not exist.
Needed by ARM.

Signed-off-by: Rabin Vincent 
---
 kernel/events/uprobes.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 98256bc..db4e3ab 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1450,7 +1450,8 @@ static struct uprobe *find_active_uprobe(unsigned long bp_vaddr, int *is_swbp)
 
 void __weak arch_uprobe_enable_step(struct arch_uprobe *arch)
 {
-   user_enable_single_step(current);
+   if (arch_has_single_step())
+   user_enable_single_step(current);
 }
 
 void __weak arch_uprobe_disable_step(struct arch_uprobe *arch)
-- 
1.7.9.5


[PATCH 6/9] uprobes: flush cache after xol write

2012-10-14 Thread Rabin Vincent

Flush the cache so that the instructions written to the XOL area are
visible.

Signed-off-by: Rabin Vincent 
---
 kernel/events/uprobes.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index ca000a9..8c52f93 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1246,6 +1246,7 @@ static unsigned long xol_get_insn_slot(struct uprobe *uprobe, unsigned long slot
offset = current->utask->xol_vaddr & ~PAGE_MASK;
vaddr = kmap_atomic(area->page);
arch_uprobe_xol_copy(&uprobe->arch, vaddr + offset);
+   flush_dcache_page(area->page);
kunmap_atomic(vaddr);
 
return current->utask->xol_vaddr;
-- 
1.7.9.5


[PATCH 4/9] uprobes: allow arch access to xol slot

2012-10-14 Thread Rabin Vincent

Allow arches to customize how the instruction is filled into the xol
slot.  ARM will use this to insert an undefined instruction after the
real instruction in order to simulate a single step of the instruction
without hardware support.

Signed-off-by: Rabin Vincent 
---
 include/linux/uprobes.h |1 +
 kernel/events/uprobes.c |7 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index da21b66..b4380ad 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -129,6 +129,7 @@ extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
 extern int  arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
 extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
 extern bool __weak arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs);
+extern void __weak arch_uprobe_xol_copy(struct arch_uprobe *auprobe, void *vaddr);
 #else /* !CONFIG_UPROBES */
 struct uprobes_state {
 };
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index a0e1a38..f7ff3a4 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1211,6 +1211,11 @@ static unsigned long xol_take_insn_slot(struct xol_area *area)
return slot_addr;
 }
 
+void __weak arch_uprobe_xol_copy(struct arch_uprobe *auprobe, void *vaddr)
+{
+   memcpy(vaddr, auprobe->insn, MAX_UINSN_BYTES);
+}
+
 /*
  * xol_get_insn_slot - If was not allocated a slot, then
  * allocate a slot.
@@ -1240,7 +1245,7 @@ static unsigned long xol_get_insn_slot(struct uprobe *uprobe, unsigned long slot
current->utask->vaddr = slot_addr;
offset = current->utask->xol_vaddr & ~PAGE_MASK;
vaddr = kmap_atomic(area->page);
-   memcpy(vaddr + offset, uprobe->arch.insn, MAX_UINSN_BYTES);
+   arch_uprobe_xol_copy(&uprobe->arch, vaddr + offset);
kunmap_atomic(vaddr);
 
return current->utask->xol_vaddr;
-- 
1.7.9.5
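
To make the intent of the hook a bit more concrete, below is a userspace
sketch of the default copy next to one possible arch override that places
a trapping instruction right behind the copied one.  The override is only
a guess at the shape of what the description above calls "insert an
undefined instruction after the real instruction"; the demo_* names are
invented, and the two constants mirror MAX_UINSN_BYTES and UPROBE_SS_INSN
from patch 9/9.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_UINSN_BYTES	4		/* MAX_UINSN_BYTES in patch 9/9 */
#define DEMO_SS_INSN		0x07f001faU	/* UPROBE_SS_INSN in patch 9/9 */

/* Default behaviour: copy just the probed instruction into the slot. */
void demo_xol_copy_default(void *slot, const uint8_t *insn)
{
	memcpy(slot, insn, DEMO_MAX_UINSN_BYTES);
}

/*
 * Hypothetical override: copy the instruction, then store a trapping
 * instruction directly after it so that control comes back once the
 * copied instruction has executed.
 */
void demo_xol_copy_with_trap(void *slot, const uint8_t *insn)
{
	const uint32_t trap = DEMO_SS_INSN;

	memcpy(slot, insn, DEMO_MAX_UINSN_BYTES);
	memcpy((uint8_t *)slot + DEMO_MAX_UINSN_BYTES, &trap, sizeof(trap));
}
```

The slot has to be at least eight bytes for the second variant, which
fits within the 64-byte UPROBE_XOL_SLOT_BYTES used by the ARM patch.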


Re: [STABLE REQUEST] add: e1000: fix lockdep splat in shutdown handler

2012-10-14 Thread Thomas Gleixner

On Thu, 11 Oct 2012, Steven Rostedt wrote:
> commit 3a3847e007aae732d64d8fd1374126393e9879a3
> Author: Jesse Brandeburg 
> Date:   Wed Jan 4 20:23:33 2012 +
> 
> e1000: fix lockdep splat in shutdown handler

as I discussed with Jesse on IRC, there is another possible deadlock
lurking in the e1000 code.

static void e1000_reinit_safe(struct e1000_adapter *adapter)
{
while (test_and_set_bit(__E1000_RESETTING, &adapter->flags))
msleep(1);
mutex_lock(&adapter->mutex);
e1000_down(adapter);

e1000_down() waits on the various work tasks to shut down, but those
work functions might be blocked on the adapter mutex.

I have no idea how I managed to trigger that one, but it's real. The
task dump I got out of the machine shows stuff waiting on each other
forever.

I can't give you a recipe to reproduce it. Looking at the code this is
not very surprising. It takes quite some coincidence of having
e1000_reinit_safe() being invoked and the delayed work timer bringing
the work on right after e1000_reinit_safe() took the adapter mutex.

Thanks,

tglx
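
The "waiting on each other forever" situation above can be written down
as a two-entry wait-for table: e1000_reinit_safe() holds the mutex and
waits for the work to finish, while the work waits for the mutex.  A toy
userspace sketch of that reasoning follows; the function and the table
layout are invented for illustration and have nothing to do with how the
driver or lockdep track this.

```c
#include <stddef.h>

/*
 * waits_for[i] is the index of the party that i is blocked on, or -1 if
 * i can make progress.  Returns 1 if following the chain from 'start'
 * leads back to 'start', i.e. 'start' is part of a wait cycle.
 * Entries are assumed to be -1 or a valid index below n.
 */
int demo_in_wait_cycle(const int *waits_for, size_t n, size_t start)
{
	size_t cur = start;

	for (size_t step = 0; step < n; step++) {
		int next = waits_for[cur];

		if (next < 0)
			return 0;	/* chain ends at a runnable party */
		if ((size_t)next == start)
			return 1;	/* came back to where we began */
		cur = (size_t)next;
	}
	return 0;
}
```

With index 0 for the reinit path and index 1 for the work function, the
table { 1, 0 } is a cycle; if the work is not blocked on the mutex the
table is { 1, -1 } and the reinit path eventually gets to continue.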

[PATCH 9/9] ARM: add uprobes support

2012-10-14 Thread Rabin Vincent

Add basic uprobes support for ARM.

perf probe --exec and SystemTap's userspace probing work.  The ARM
kprobes test code has also been run in a userspace harness to test the
uprobe instruction decoding.

Caveats:

 - Thumb is not supported
 - XOL abort/trap handling is not implemented

Signed-off-by: Rabin Vincent 
---
 arch/arm/Kconfig   |4 +
 arch/arm/include/asm/ptrace.h  |6 ++
 arch/arm/include/asm/thread_info.h |5 +-
 arch/arm/include/asm/uprobes.h |   34 +++
 arch/arm/kernel/Makefile   |1 +
 arch/arm/kernel/signal.c   |4 +
 arch/arm/kernel/uprobes.c  |  191 
 7 files changed, 244 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/include/asm/uprobes.h
 create mode 100644 arch/arm/kernel/uprobes.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 272c3a1..2191b61d 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -168,6 +168,10 @@ config ZONE_DMA
 config NEED_DMA_MAP_STATE
def_bool y
 
+config ARCH_SUPPORTS_UPROBES
+   depends on KPROBES
+   def_bool y
+
 config ARCH_HAS_DMA_SET_COHERENT_MASK
bool
 
diff --git a/arch/arm/include/asm/ptrace.h b/arch/arm/include/asm/ptrace.h
index 142d6ae..297936a 100644
--- a/arch/arm/include/asm/ptrace.h
+++ b/arch/arm/include/asm/ptrace.h
@@ -197,6 +197,12 @@ static inline long regs_return_value(struct pt_regs *regs)
 
 #define instruction_pointer(regs)  (regs)->ARM_pc
 
+static inline void instruction_pointer_set(struct pt_regs *regs,
+  unsigned long val)
+{
+   instruction_pointer(regs) = val;
+}
+
 #ifdef CONFIG_SMP
 extern unsigned long profile_pc(struct pt_regs *regs);
 #else
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 8477b4c..7bedaee 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -148,6 +148,7 @@ extern int vfp_restore_user_hwstate(struct user_vfp __user *,
 #define TIF_SIGPENDING 0
 #define TIF_NEED_RESCHED   1
 #define TIF_NOTIFY_RESUME  2   /* callback before returning to user */
+#define TIF_UPROBE 7
 #define TIF_SYSCALL_TRACE  8
 #define TIF_SYSCALL_AUDIT  9
 #define TIF_SYSCALL_TRACEPOINT 10
@@ -160,6 +161,7 @@ extern int vfp_restore_user_hwstate(struct user_vfp __user *,
 #define _TIF_SIGPENDING(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED  (1 << TIF_NEED_RESCHED)
 #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
+#define _TIF_UPROBE(1 << TIF_UPROBE)
 #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
 #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
 #define _TIF_SYSCALL_TRACEPOINT(1 << TIF_SYSCALL_TRACEPOINT)
@@ -172,7 +174,8 @@ extern int vfp_restore_user_hwstate(struct user_vfp __user *,
 /*
  * Change these and you break ASM code in entry-common.S
  */
-#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | _TIF_NOTIFY_RESUME)
+#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
+_TIF_NOTIFY_RESUME | _TIF_UPROBE)
 
 #endif /* __KERNEL__ */
 #endif /* __ASM_ARM_THREAD_INFO_H */
diff --git a/arch/arm/include/asm/uprobes.h b/arch/arm/include/asm/uprobes.h
new file mode 100644
index 000..fa4b81e
--- /dev/null
+++ b/arch/arm/include/asm/uprobes.h
@@ -0,0 +1,34 @@
+#ifndef _ASM_UPROBES_H
+#define _ASM_UPROBES_H
+
+#include 
+
+typedef u32 uprobe_opcode_t;
+
+#define MAX_UINSN_BYTES4
+#define UPROBE_XOL_SLOT_BYTES  64
+
+#define UPROBE_SWBP_INSN   0x07f001f9
+#define UPROBE_SS_INSN 0x07f001fa
+#define UPROBE_SWBP_INSN_SIZE  4
+
+struct arch_uprobe_task {
+   u32 backup;
+};
+
+struct arch_uprobe {
+   u8 insn[MAX_UINSN_BYTES];
+   uprobe_opcode_t modinsn;
+   uprobe_opcode_t bpinsn;
+   bool simulate;
+   u32 pcreg;
+   void (*prehandler)(struct arch_uprobe *auprobe,
+  struct arch_uprobe_task *autask,
+  struct pt_regs *regs);
+   void (*posthandler)(struct arch_uprobe *auprobe,
+   struct arch_uprobe_task *autask,
+   struct pt_regs *regs);
+   struct arch_specific_insn asi;
+};
+
+#endif
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index 5bbec7b..a39f634 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -40,6 +40,7 @@ obj-$(CONFIG_DYNAMIC_FTRACE)  += ftrace.o insn.o
 obj-$(CONFIG_FUNCTION_GRAPH_TRACER)+= ftrace.o insn.o
 obj-$(CONFIG_JUMP_LABEL)   += jump_label.o insn.o patch.o
 obj-$(CONFIG_KEXEC)+= machine_kexec.o relocate_kernel.o
+obj-$(CONFIG_UPROBES)  += uprobes.o uprobes-arm.o kprobes-arm.o
 obj-$(CONFIG_KPROBES)  += kprobes.o kprobes-common.o patch.o
 ifdef CONFIG_THUMB2_KERNEL
 obj-$(CONFIG_KPROBES)  += kprobes-thumb.o
diff --git a/arch/arm/

Re: [revert request for commit 9fff2fa] Re: [git pull] signals pile 3

2012-10-14 Thread Al Viro

On Sun, Oct 14, 2012 at 06:26:40PM +0100, Al Viro wrote:
> and the last 3 make no sense whatsoever.  Note that on normal execve() we'll
> be going through the syscall return, so the userland will see 0 in there,
> no matter what we do here.  Theoretically, it might've been done for
> ptrace sake (it will be able to observe the values in those registers before
> the tracee reaches userland),

Except that it won't be able to see what start_thread() puts in r0 either;
on successful execve(2) we will store return value of sys_execve() (i.e. 0)
in regs->ARM_r0 before we get to any of the places where it could have
examined the sucker.  So what was that assignment for?  And as far as I can
see, ARM ELF ABI says that general register values on process startup are
undefined, so r1 and r2 assignments also seem to be pointless.  OTOH, they
predate the ELF conversion by quite a bit - that code had been there since
1.x times, when we used to use a.out...  In any case, they were *not* going
to be usable as main() arguments - zero argc would make userland rather
unhappy.  I don't have arm libc sources from those times, but I'd expect
it to have all those suckers read from userland stack...

Russell, could you recall what those had been about?  I'm not sure if that
had been oopsable that far back (again, oops scenario is userland stack
page getting swapped out before we get to start_thread(), leading to
direct read from an absent page in start_thread() by plain ldr, without
anything in exception table about that insn), but it looks very odd
regardless of that problem.

[PATCH 5/9] uprobes: allow arch-specific initialization

2012-10-14 Thread Rabin Vincent

Add a weak function for any architecture-specific initialization.  ARM
will use this to register the handlers for the undefined instructions it
uses to implement uprobes.

Signed-off-by: Rabin Vincent 
---
 include/linux/uprobes.h |1 +
 kernel/events/uprobes.c |   10 ++
 2 files changed, 11 insertions(+)

diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index b4380ad..c3dc5de 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -130,6 +130,7 @@ extern int  arch_uprobe_exception_notify(struct notifier_block *self, unsigned l
 extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
 extern bool __weak arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs);
 extern void __weak arch_uprobe_xol_copy(struct arch_uprobe *auprobe, void *vaddr);
+extern int __weak arch_uprobes_init(void);
 #else /* !CONFIG_UPROBES */
 struct uprobes_state {
 };
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index f7ff3a4..ca000a9 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1634,8 +1634,14 @@ static struct notifier_block uprobe_exception_nb = {
 .priority   = INT_MAX-1,/* notified after kprobes, kgdb */
 };
 
+int __weak __init arch_uprobes_init(void)
+{
+   return 0;
+}
+
 static int __init init_uprobes(void)
 {
+   int ret;
int i;
 
for (i = 0; i < UPROBES_HASH_SZ; i++) {
@@ -1643,6 +1649,10 @@ static int __init init_uprobes(void)
mutex_init(&uprobes_mmap_mutex[i]);
}
 
+   ret = arch_uprobes_init();
+   if (ret)
+   return ret;
+
return register_die_notifier(&uprobe_exception_nb);
 }
 module_init(init_uprobes);
-- 
1.7.9.5
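
For readers less used to weak symbols, here is a small userspace sketch
of the pattern the patch relies on: a default hook that returns 0, which
another object file may replace with a strong definition, and an init
function that bails out early when the hook fails.  It assumes GCC or
Clang for the weak attribute; the demo_* names and the stand-in notifier
registration are invented for illustration.

```c
/* Default hook: does nothing.  A strong definition elsewhere overrides it. */
__attribute__((weak)) int demo_arch_init(void)
{
	return 0;
}

/* Stand-in for the generic registration step that follows the hook. */
int demo_register_notifier(void)
{
	return 0;
}

/* Run the arch hook first and propagate its error code unchanged. */
int demo_init(void)
{
	int ret;

	ret = demo_arch_init();
	if (ret)
		return ret;

	return demo_register_notifier();
}
```

If another file in the same program defines a non-weak demo_arch_init()
that returns, say, -1, the linker picks that one and demo_init() returns
-1 without reaching the registration step.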


[PATCH 8/9] ARM: support uprobe handling

2012-10-14 Thread Rabin Vincent

Extend the kprobes code to handle user-space probes.  Much of the code
can be reused so currently the ARM uprobes code reuses the kprobes
structures.  The decode tables are reused, with the modification that
for those instructions that require custom decoding for uprobes, a new
element is added in the table to specify a custom decoder function.

Thumb is not handled.

Cc: Jon Medhurst 
Signed-off-by: Rabin Vincent 
---
 arch/arm/include/asm/kprobes.h   |   17 +---
 arch/arm/include/asm/probes.h|   23 +
 arch/arm/kernel/kprobes-arm.c|   27 +-
 arch/arm/kernel/kprobes-common.c |   63 ++
 arch/arm/kernel/kprobes-test.c   |   12 ++-
 arch/arm/kernel/kprobes-thumb.c  |   31 ---
 arch/arm/kernel/kprobes.c|2 +-
 arch/arm/kernel/kprobes.h|   36 +---
 arch/arm/kernel/uprobes-arm.c|  178 ++
 arch/arm/kernel/uprobes.h|   23 +
 10 files changed, 349 insertions(+), 63 deletions(-)
 create mode 100644 arch/arm/include/asm/probes.h
 create mode 100644 arch/arm/kernel/uprobes-arm.c
 create mode 100644 arch/arm/kernel/uprobes.h

diff --git a/arch/arm/include/asm/kprobes.h b/arch/arm/include/asm/kprobes.h
index f82ec22..53b1a80 100644
--- a/arch/arm/include/asm/kprobes.h
+++ b/arch/arm/include/asm/kprobes.h
@@ -27,22 +27,7 @@
 #define flush_insn_slot(p) do { } while (0)
 #define kretprobe_blacklist_size   0
 
-typedef u32 kprobe_opcode_t;
-
-struct kprobe;
-typedef void (kprobe_insn_handler_t)(struct kprobe *, struct pt_regs *);
-typedef unsigned long (kprobe_check_cc)(unsigned long);
-typedef void (kprobe_insn_singlestep_t)(struct kprobe *, struct pt_regs *);
-typedef void (kprobe_insn_fn_t)(void);
-
-/* Architecture specific copy of original instruction. */
-struct arch_specific_insn {
-   kprobe_opcode_t *insn;
-   kprobe_insn_handler_t   *insn_handler;
-   kprobe_check_cc *insn_check_cc;
-   kprobe_insn_singlestep_t*insn_singlestep;
-   kprobe_insn_fn_t*insn_fn;
-};
+#include 
 
 struct prev_kprobe {
struct kprobe *kp;
diff --git a/arch/arm/include/asm/probes.h b/arch/arm/include/asm/probes.h
new file mode 100644
index 000..df46994
--- /dev/null
+++ b/arch/arm/include/asm/probes.h
@@ -0,0 +1,23 @@
+#ifndef _ASM_PROBES_H
+#define _ASM_PROBES_H
+
+#ifdef CONFIG_KPROBES
+typedef u32 kprobe_opcode_t;
+
+struct kprobe;
+typedef void (kprobe_insn_handler_t)(struct kprobe *, struct pt_regs *);
+typedef unsigned long (kprobe_check_cc)(unsigned long);
+typedef void (kprobe_insn_singlestep_t)(struct kprobe *, struct pt_regs *);
+typedef void (kprobe_insn_fn_t)(void);
+
+/* Architecture specific copy of original instruction. */
+struct arch_specific_insn {
+   kprobe_opcode_t *insn;
+   kprobe_insn_handler_t   *insn_handler;
+   kprobe_check_cc *insn_check_cc;
+   kprobe_insn_singlestep_t*insn_singlestep;
+   kprobe_insn_fn_t*insn_fn;
+};
+#endif
+
+#endif
diff --git a/arch/arm/kernel/kprobes-arm.c b/arch/arm/kernel/kprobes-arm.c
index 8a30c89..d9cf0e2 100644
--- a/arch/arm/kernel/kprobes-arm.c
+++ b/arch/arm/kernel/kprobes-arm.c
@@ -62,6 +62,7 @@
 #include 
 #include 
 
+#include "uprobes.h"
 #include "kprobes.h"
 
 #define sign_extend(x, signbit) ((x) | (0 - ((x) & (1 << (signbit)
@@ -545,31 +546,37 @@ static const union decode_item arm__000x_1xx1_table[] = {
 
/* LDRD (register)   000x x0x0    1101  */
/* STRD (register)   000x x0x0      */
+   UDECODE_NEXT (decode_pc_ro)
DECODE_EMULATEX (0x0e5000d0, 0x00d0, emulate_ldrdstrd,
 REGS(NOPCWB, NOPCX, 0, 0, NOPC)),
 
/* LDRD (immediate)  000x x1x0    1101  */
/* STRD (immediate)  000x x1x0      */
+   UDECODE_NEXT (decode_pc_ro)
DECODE_EMULATEX (0x0e5000d0, 0x004000d0, emulate_ldrdstrd,
 REGS(NOPCWB, NOPCX, 0, 0, 0)),
 
/* STRH (register)   000x x0x0    1011  */
+   UDECODE_NEXT (decode_pc_ro)
DECODE_EMULATEX (0x0e5000f0, 0x00b0, emulate_str,
 REGS(NOPCWB, NOPC, 0, 0, NOPC)),
 
/* LDRH (register)   000x x0x1    1011  */
/* LDRSB (register)  000x x0x1    1101  */
/* LDRSH (register)  000x x0x1      */
+   UDECODE_NEXT (decode_pc_ro)
DECODE_EMULATEX (0x0e500090, 0x00100090, emulate_ldr,
 REGS(NOPCWB, NOPC, 0, 0, NOPC)),
 
/* STRH (immediate)  000x x1x0    1011  */
+   UDECODE_NEXT (decode_pc_ro)
DECODE_EMULATEX (0x0e5000f0

[PATCH 7/9] uprobes: add arch write opcode hook

2012-10-14 Thread Rabin Vincent

Allow arches to write the opcode with a custom function.  ARM needs to
customize the swbp instruction depending on the condition code of the
instruction it replaces.

Signed-off-by: Rabin Vincent 
---
 include/linux/uprobes.h |3 +++
 kernel/events/uprobes.c |8 +++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index c3dc5de..35b9490 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -131,6 +131,9 @@ extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs)
 extern bool __weak arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs);
 extern void __weak arch_uprobe_xol_copy(struct arch_uprobe *auprobe, void *vaddr);
 extern int __weak arch_uprobes_init(void);
+extern void __weak arch_uprobe_write_opcode(struct arch_uprobe *auprobe,
+   void *vaddr,
+   uprobe_opcode_t opcode);
 #else /* !CONFIG_UPROBES */
 struct uprobes_state {
 };
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 8c52f93..95ea618 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -203,6 +203,12 @@ bool __weak is_swbp_insn(uprobe_opcode_t *insn)
  * have fixed length instructions.
  */
 
+void __weak arch_uprobe_write_opcode(struct arch_uprobe *auprobe, void *vaddr,
+uprobe_opcode_t opcode)
+{
+   memcpy(vaddr, &opcode, UPROBE_SWBP_INSN_SIZE);
+}
+
 /*
  * write_opcode - write the opcode at a given virtual address.
  * @auprobe: arch breakpointing information.
@@ -242,7 +248,7 @@ retry:
vaddr_new = kmap_atomic(new_page);
 
memcpy(vaddr_new, vaddr_old, PAGE_SIZE);
-   memcpy(vaddr_new + (vaddr & ~PAGE_MASK), &opcode, UPROBE_SWBP_INSN_SIZE);
+   arch_uprobe_write_opcode(auprobe, vaddr_new + (vaddr & ~PAGE_MASK), opcode);
 
kunmap_atomic(vaddr_new);
kunmap_atomic(vaddr_old);
-- 
1.7.9.5
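
One way an arch hook could "customize the swbp instruction depending on
the condition code" is to carry the condition field (bits 31:28 of an ARM
instruction) over from the probed instruction into the breakpoint
encoding.  The sketch below shows just that bit manipulation.  It is an
assumption about the shape of the ARM side, not code from this series,
and demo_cond_swbp is an invented name; the breakpoint value is
UPROBE_SWBP_INSN from patch 9/9.

```c
#include <stdint.h>

#define DEMO_SWBP_INSN	0x07f001f9U	/* UPROBE_SWBP_INSN in patch 9/9 */
#define DEMO_COND_MASK	0xf0000000U	/* ARM condition field, bits [31:28] */

/*
 * Build a breakpoint instruction that carries the same condition field
 * as the instruction it replaces.  Encodings with a condition field of
 * 0xF (the unconditional space) would need separate handling, which
 * this sketch ignores.
 */
uint32_t demo_cond_swbp(uint32_t orig_insn)
{
	return (orig_insn & DEMO_COND_MASK) |
	       (DEMO_SWBP_INSN & ~DEMO_COND_MASK);
}
```

For a bne (0x1a000001) this gives 0x17f001f9, and for an always-executed
instruction such as 0xe1a00000 it gives 0xe7f001f9.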


[PATCH 3/9] uprobes: allow ignoring of probe hits

2012-10-14 Thread Rabin Vincent

Allow arches to decide to ignore a probe hit.  ARM will use this to
only call handlers if the conditions to execute a conditionally executed
instruction are satisfied.

Signed-off-by: Rabin Vincent 
---
 include/linux/uprobes.h |1 +
 kernel/events/uprobes.c |   14 +-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index ac90704..da21b66 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -128,6 +128,7 @@ extern int  arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
 extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
 extern int  arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
 extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
+extern bool __weak arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs);
 #else /* !CONFIG_UPROBES */
 struct uprobes_state {
 };
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index db4e3ab..a0e1a38 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1419,6 +1419,11 @@ static void mmf_recalc_uprobes(struct mm_struct *mm)
clear_bit(MMF_HAS_UPROBES, &mm->flags);
 }
 
+bool __weak arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs)
+{
+   return false;
+}
+
 static struct uprobe *find_active_uprobe(unsigned long bp_vaddr, int *is_swbp)
 {
struct mm_struct *mm = current->mm;
@@ -1469,6 +1474,7 @@ static void handle_swbp(struct pt_regs *regs)
struct uprobe *uprobe;
unsigned long bp_vaddr;
int uninitialized_var(is_swbp);
+   bool ignored = false;
 
bp_vaddr = uprobe_get_swbp_addr(regs);
uprobe = find_active_uprobe(bp_vaddr, &is_swbp);
@@ -1499,6 +1505,12 @@ static void handle_swbp(struct pt_regs *regs)
goto cleanup_ret;
}
utask->active_uprobe = uprobe;
+
+   if (arch_uprobe_ignore(&uprobe->arch, regs)) {
+   ignored = true;
+   goto cleanup_ret;
+   }
+
handler_chain(uprobe, regs);
if (uprobe->flags & UPROBE_SKIP_SSTEP && can_skip_sstep(uprobe, regs))
goto cleanup_ret;
@@ -1514,7 +1526,7 @@ cleanup_ret:
utask->active_uprobe = NULL;
utask->state = UTASK_RUNNING;
}
-   if (!(uprobe->flags & UPROBE_SKIP_SSTEP))
+   if (!ignored && !(uprobe->flags & UPROBE_SKIP_SSTEP))
 
/*
 * cannot singlestep; cannot skip instruction;
-- 
1.7.9.5


[PATCH 1/9] uprobes: move function declarations out of arch

2012-10-14 Thread Rabin Vincent

It seems odd to keep the function declarations in the arch header where
they will need to be copy/pasted verbatim across arches.  Move them to
the common header.

Signed-off-by: Rabin Vincent 
---
 arch/x86/include/asm/uprobes.h |6 --
 include/linux/uprobes.h|8 
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/uprobes.h b/arch/x86/include/asm/uprobes.h
index 8ff8be7..b20b4d6 100644
--- a/arch/x86/include/asm/uprobes.h
+++ b/arch/x86/include/asm/uprobes.h
@@ -49,10 +49,4 @@ struct arch_uprobe_task {
unsigned intsaved_tf;
 };
 
-extern int  arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long addr);
-extern int  arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs);
-extern int  arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
-extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
-extern int  arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
-extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
 #endif /* _ASM_UPROBES_H */
diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index e6f0331..ac90704 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -30,6 +30,7 @@
 struct vm_area_struct;
 struct mm_struct;
 struct inode;
+struct notifier_block;
 
 #ifdef CONFIG_ARCH_SUPPORTS_UPROBES
 # include 
@@ -120,6 +121,13 @@ extern void uprobe_notify_resume(struct pt_regs *regs);
 extern bool uprobe_deny_signal(void);
 extern bool __weak arch_uprobe_skip_sstep(struct arch_uprobe *aup, struct pt_regs *regs);
 extern void uprobe_clear_state(struct mm_struct *mm);
+extern void uprobe_reset_state(struct mm_struct *mm);
+extern int  arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm,unsigned long addr);
+extern int  arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs);
+extern int  arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
+extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
+extern int  arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
+extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
 #else /* !CONFIG_UPROBES */
 struct uprobes_state {
 };
-- 
1.7.9.5


[PATCH 5/5] snd-ice1712: Fix resume on ice1724

2012-10-14 Thread Ondrej Zary

set_pro_rate() is called from hw_params() but not from prepare(), so a
running PCM breaks across suspend/resume.
Call it from prepare() if the PCM was suspended to fix the problem.

Signed-off-by: Ondrej Zary 
---
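The fix can be modelled in a few lines of userspace C: prepare() reprograms the rate only when the stream is coming back from a suspended state. This is a sketch under assumptions, not the driver code. The sketch_* types are hypothetical, and hw_rate == 0 is an assumed way to model the hardware having lost its rate setting during suspend.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_pcm_state { SKETCH_STATE_RUNNING, SKETCH_STATE_SUSPENDED };

struct sketch_chip {
	unsigned int hw_rate;	/* rate programmed in hardware; 0 models lost state */
};

struct sketch_stream {
	enum sketch_pcm_state state;
	unsigned int rate;	/* rate negotiated earlier, at hw_params time */
};

/* Stand-in for the rate-setting helper; rejects a zero rate. */
static int sketch_set_rate(struct sketch_chip *chip, unsigned int rate)
{
	if (rate == 0)
		return -1;
	chip->hw_rate = rate;
	return 0;
}

/* Re-apply the rate only when resuming from suspend, then continue setup. */
static int sketch_prepare(struct sketch_chip *chip, struct sketch_stream *s)
{
	int err;

	assert(chip != NULL && s != NULL);

	if (s->state == SKETCH_STATE_SUSPENDED) {
		err = sketch_set_rate(chip, s->rate);
		if (err < 0)
			return err;
	}

	/* the remaining register setup of prepare() would follow here */
	return 0;
}
```

A stream that was never suspended skips the extra call, so the normal hw_params() then prepare() path is unchanged.
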
 sound/pci/ice1712/ice1724.c |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/sound/pci/ice1712/ice1724.c b/sound/pci/ice1712/ice1724.c
index 0eb7ec6..ade3354 100644
--- a/sound/pci/ice1712/ice1724.c
+++ b/sound/pci/ice1712/ice1724.c
@@ -783,6 +783,13 @@ static int snd_vt1724_playback_pro_prepare(struct snd_pcm_substream *substream)
struct snd_ice1712 *ice = snd_pcm_substream_chip(substream);
unsigned char val;
unsigned int size;
+   int err;
+
+   if (substream->runtime->status->state == SNDRV_PCM_STATE_SUSPENDED) {
+   err = snd_vt1724_set_pro_rate(ice, substream->runtime->rate, 0);
+   if (err < 0)
+   return err;
+   }
 
spin_lock_irq(&ice->reg_lock);
val = (8 - substream->runtime->channels) >> 1;
@@ -853,6 +860,13 @@ static int snd_vt1724_pcm_prepare(struct snd_pcm_substream *substream)
 {
struct snd_ice1712 *ice = snd_pcm_substream_chip(substream);
const struct vt1724_pcm_reg *reg = substream->runtime->private_data;
+   int err;
+
+   if (substream->runtime->status->state == SNDRV_PCM_STATE_SUSPENDED) {
+   err = snd_vt1724_set_pro_rate(ice, substream->runtime->rate, 0);
+   if (err < 0)
+   return err;
+   }
 
spin_lock_irq(&ice->reg_lock);
outl(substream->runtime->dma_addr, ice->profi_port + reg->addr);
-- 
Ondrej Zary


[PATCH 4/5] snd-ice1712: Add Philips PSC724 Ultimate Edge

2012-10-14 Thread Ondrej Zary

Add psc724 subdriver to snd-ice1712 that provides full support for
Philips PSC724 Ultimate Edge sound cards.

Signed-off-by: Ondrej Zary 
---
 sound/pci/Kconfig   |2 +-
 sound/pci/ice1712/Makefile  |2 +-
 sound/pci/ice1712/ice1724.c |4 +-
 sound/pci/ice1712/psc724.c  |  464 +++
 sound/pci/ice1712/psc724.h  |   13 ++
 5 files changed, 482 insertions(+), 3 deletions(-)
 create mode 100644 sound/pci/ice1712/psc724.c
 create mode 100644 sound/pci/ice1712/psc724.h

diff --git a/sound/pci/Kconfig b/sound/pci/Kconfig
index ff3af6e..5df1635 100644
--- a/sound/pci/Kconfig
+++ b/sound/pci/Kconfig
@@ -630,7 +630,7 @@ config SND_ICE1724
  AudioTrak Prodigy 192, 7.1 (HIFI/LT/XT), HD2; Hercules
  Fortissimo IV; ESI Juli@; Pontis MS300; EGO-SYS WaveTerminal
  192M; Albatron K8X800 Pro II; Chaintech ZNF3-150/250, 9CJS,
- AV-710; Shuttle SN25P.
+ AV-710; Shuttle SN25P; Philips PSC724 Ultimate Edge.
 
  To compile this driver as a module, choose M here: the module
  will be called snd-ice1724.
diff --git a/sound/pci/ice1712/Makefile b/sound/pci/ice1712/Makefile
index f7ce33f..7e50c13 100644
--- a/sound/pci/ice1712/Makefile
+++ b/sound/pci/ice1712/Makefile
@@ -5,7 +5,7 @@
 
 snd-ice17xx-ak4xxx-objs := ak4xxx.o
 snd-ice1712-objs := ice1712.o delta.o hoontech.o ews.o
-snd-ice1724-objs := ice1724.o amp.o revo.o aureon.o vt1720_mobo.o pontis.o prodigy192.o prodigy_hifi.o juli.o phase.o wtm.o se.o maya44.o quartet.o
+snd-ice1724-objs := ice1724.o amp.o revo.o aureon.o vt1720_mobo.o pontis.o prodigy192.o prodigy_hifi.o juli.o phase.o wtm.o se.o maya44.o quartet.o psc724.o wm8766.o wm8776.o
 
 # Toplevel Module Dependency
 obj-$(CONFIG_SND_ICE1712) += snd-ice1712.o snd-ice17xx-ak4xxx.o
diff --git a/sound/pci/ice1712/ice1724.c b/sound/pci/ice1712/ice1724.c
index a529d30..0eb7ec6 100644
--- a/sound/pci/ice1712/ice1724.c
+++ b/sound/pci/ice1712/ice1724.c
@@ -54,6 +54,7 @@
 #include "wtm.h"
 #include "se.h"
 #include "quartet.h"
+#include "psc724.h"
 
 MODULE_AUTHOR("Jaroslav Kysela ");
 MODULE_DESCRIPTION("VIA ICEnsemble ICE1724/1720 (Envy24HT/PT)");
@@ -2257,6 +2258,7 @@ static struct snd_ice1712_card_info *card_tables[] __devinitdata = {
snd_vt1724_se_cards,
snd_vt1724_qtet_cards,
snd_vt1724_ooaoo_cards,
+   snd_vt1724_psc724_cards,
NULL,
 };
 
@@ -2372,7 +2374,7 @@ static int __devinit snd_vt1724_read_eeprom(struct snd_ice1712 *ice,
return -EIO;
}
ice->eeprom.version = snd_vt1724_read_i2c(ice, dev, 0x05);
-   if (ice->eeprom.version != 2)
+   if (ice->eeprom.version != 1 && ice->eeprom.version != 2)
printk(KERN_WARNING "ice1724: Invalid EEPROM version %i\n",
   ice->eeprom.version);
size = ice->eeprom.size - 6;
diff --git a/sound/pci/ice1712/psc724.c b/sound/pci/ice1712/psc724.c
new file mode 100644
index 000..5a4abe7
--- /dev/null
+++ b/sound/pci/ice1712/psc724.c
@@ -0,0 +1,464 @@
+/*
+ *   ALSA driver for ICEnsemble VT1724 (Envy24HT)
+ *
+ *   Lowlevel functions for Philips PSC724 Ultimate Edge
+ *
+ * Copyright (c) 2012 Ondrej Zary 
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation; either version 2 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program; if not, write to the Free Software
+ *   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include "ice1712.h"
+#include "envy24ht.h"
+#include "psc724.h"
+#include "wm8766.h"
+#include "wm8776.h"
+
+struct psc724_spec {
+   struct snd_wm8766 wm8766;
+   struct snd_wm8776 wm8776;
+   bool mute_all, jack_detect;
+   struct snd_ice1712 *ice;
+   struct delayed_work hp_work;
+   bool hp_connected;
+};
+
+//
+/*  PHILIPS PSC724 ULTIMATE EDGE*/
+//
+/*
+ *  VT1722 (Envy24GT) - 6 outputs, 4 inputs (only 2 used), 24-bit/96kHz
+ *
+ *  system configuration ICE_EEP2_SYSCONF=0x42
+ *XIN1 49.152MHz
+ *no MPU401
+ *one stereo ADC, no S/PDIF receiver
+ *three stereo DACs (FRONT, REAR, CENTER+LFE)
+ *
+ *  AC-Link configuration ICE_EEP2_ACLINK=0x80
+ *use I2S, not AC97
+ *
+ *  I2S converters feature ICE_EEP2_I2S=0x30
+ *I2S

[PATCH 3/5] snd-ice1712: Add Wolfson Microelectronics WM8776 codec support

2012-10-14 Thread Ondrej Zary

Needed by Philips PSC724 subdriver. The code does not contain any
card-specific bits so other ice17xx cards using this codec could be
converted to use this generic code.

Signed-off-by: Ondrej Zary 
---
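The codec code stays card-independent because the card supplies the bus write operation, while the codec layer only caches register values and packs a 9-bit data word next to the register address. A minimal userspace sketch of that split follows. All sketch_* names are hypothetical, the reset register index 0x17 is an assumed value, and the 7-bit address limit is inferred from the packing in the patch's write function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_REG_RESET 0x17	/* assumed index of the uncached reset register */

struct sketch_codec;

/* Supplied by the card subdriver; the codec code never touches the bus itself. */
struct sketch_codec_ops {
	void (*write)(struct sketch_codec *codec, uint8_t bus_addr, uint8_t bus_data);
};

struct sketch_codec {
	uint16_t regs[SKETCH_REG_RESET];	/* last value written to each register */
	struct sketch_codec_ops ops;
	uint8_t last_addr, last_data;		/* recorded by the example bus op below */
};

/* Example card-side bus op: just records what would go out on the wire. */
static void sketch_bus_write(struct sketch_codec *codec, uint8_t bus_addr,
			     uint8_t bus_data)
{
	codec->last_addr = bus_addr;
	codec->last_data = bus_data;
}

/* Pack a 7-bit address and 9-bit data into two bytes, cache, then write. */
static void sketch_codec_write(struct sketch_codec *codec, uint16_t addr,
			       uint16_t data)
{
	uint8_t bus_addr, bus_data;

	assert(codec != NULL && codec->ops.write != NULL);
	assert(addr < 0x80 && data < 0x200);

	bus_addr = (uint8_t)(addr << 1 | data >> 8);	/* addr + 9th data bit */
	bus_data = (uint8_t)(data & 0xff);		/* remaining 8 data bits */

	if (addr < SKETCH_REG_RESET)
		codec->regs[addr] = data;
	codec->ops.write(codec, bus_addr, bus_data);
}
```

For example, writing 0x1a5 to register 0x0d yields the bus bytes 0x1b and 0xa5, and the cache keeps the full 9-bit value.
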
 sound/pci/ice1712/wm8776.c |  632 
 sound/pci/ice1712/wm8776.h |  226 
 2 files changed, 858 insertions(+), 0 deletions(-)
 create mode 100644 sound/pci/ice1712/wm8776.c
 create mode 100644 sound/pci/ice1712/wm8776.h

diff --git a/sound/pci/ice1712/wm8776.c b/sound/pci/ice1712/wm8776.c
new file mode 100644
index 000..dc333ce
--- /dev/null
+++ b/sound/pci/ice1712/wm8776.c
@@ -0,0 +1,632 @@
+/*
+ *   ALSA driver for ICEnsemble VT17xx
+ *
+ *   Lowlevel functions for WM8776 codec
+ *
+ * Copyright (c) 2012 Ondrej Zary 
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation; either version 2 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program; if not, write to the Free Software
+ *   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include "wm8776.h"
+
+/* low-level access */
+
+static void snd_wm8776_write(struct snd_wm8776 *wm, u16 addr, u16 data)
+{
+   u8 bus_addr = addr << 1 | data >> 8;/* addr + 9th data bit */
+   u8 bus_data = data & 0xff;  /* remaining 8 data bits */
+
+   if (addr < WM8776_REG_RESET)
+   wm->regs[addr] = data;
+   wm->ops.write(wm, bus_addr, bus_data);
+}
+
+/* register-level functions */
+
+static void snd_wm8776_activate_ctl(struct snd_wm8776 *wm, char *ctl_name,
+   bool active)
+{
+   struct snd_card *card = wm->card;
+   struct snd_kcontrol *kctl;
+   struct snd_kcontrol_volatile *vd;
+   struct snd_ctl_elem_id elem_id;
+   unsigned int index_offset;
+
+   memset(&elem_id, 0, sizeof(elem_id));
+   strncpy(elem_id.name, ctl_name, sizeof(elem_id.name));
+   elem_id.iface = SNDRV_CTL_ELEM_IFACE_MIXER;
+   kctl = snd_ctl_find_id(card, &elem_id);
+   if (!kctl)
+   return;
+   index_offset = snd_ctl_get_ioff(kctl, &kctl->id);
+   vd = &kctl->vd[index_offset];
+   if (active)
+   vd->access &= ~SNDRV_CTL_ELEM_ACCESS_INACTIVE;
+   else
+   vd->access |= SNDRV_CTL_ELEM_ACCESS_INACTIVE;
+   snd_ctl_notify(card, SNDRV_CTL_EVENT_MASK_INFO, &kctl->id);
+}
+
+static void snd_wm8776_update_agc_ctl(struct snd_wm8776 *wm)
+{
+   int i, flags_on = 0, flags_off = 0;
+
+   switch (wm->agc_mode) {
+   case WM8776_AGC_OFF:
+   flags_off = WM8776_FLAG_LIM | WM8776_FLAG_ALC;
+   break;
+   case WM8776_AGC_LIM:
+   flags_off = WM8776_FLAG_ALC;
+   flags_on = WM8776_FLAG_LIM;
+   break;
+   case WM8776_AGC_ALC_R:
+   case WM8776_AGC_ALC_L:
+   case WM8776_AGC_ALC_STEREO:
+   flags_off = WM8776_FLAG_LIM;
+   flags_on = WM8776_FLAG_ALC;
+   break;
+   }
+
+   for (i = 0; i < WM8776_CTL_COUNT; i++)
+   if (wm->ctl[i].flags & flags_off)
+   snd_wm8776_activate_ctl(wm, wm->ctl[i].name, false);
+   else if (wm->ctl[i].flags & flags_on)
+   snd_wm8776_activate_ctl(wm, wm->ctl[i].name, true);
+}
+
+static void snd_wm8776_set_agc(struct snd_wm8776 *wm, u16 agc, u16 nothing)
+{
+   u16 alc1 = wm->regs[WM8776_REG_ALCCTRL1] & ~WM8776_ALC1_LCT_MASK;
+   u16 alc2 = wm->regs[WM8776_REG_ALCCTRL2] & ~WM8776_ALC2_LCEN;
+
+   switch (agc) {
+   case 0: /* Off */
+   wm->agc_mode = WM8776_AGC_OFF;
+   break;
+   case 1: /* Limiter */
+   alc2 |= WM8776_ALC2_LCEN;
+   wm->agc_mode = WM8776_AGC_LIM;
+   break;
+   case 2: /* ALC Right */
+   alc1 |= WM8776_ALC1_LCSEL_ALCR;
+   alc2 |= WM8776_ALC2_LCEN;
+   wm->agc_mode = WM8776_AGC_ALC_R;
+   break;
+   case 3: /* ALC Left */
+   alc1 |= WM8776_ALC1_LCSEL_ALCL;
+   alc2 |= WM8776_ALC2_LCEN;
+   wm->agc_mode = WM8776_AGC_ALC_L;
+   break;
+   case 4: /* ALC Stereo */
+   alc1 |= WM8776_ALC1_LCSEL_ALCSTEREO;
+   alc2 |= WM8776_ALC2_LCEN;
+   wm->agc_mode = WM8776_AGC_ALC_STEREO;
+   break;
+   }
+   snd_wm8776_write(wm, WM8776_RE

[PATCH 2/5] snd-ice1712: Add Wolfson Microelectronics WM8766 codec support

2012-10-14 Thread Ondrej Zary

Needed by Philips PSC724 subdriver. The code does not contain any
card-specific bits so other ice17xx cards using this codec could be
converted to use this generic code.

Signed-off-by: Ondrej Zary 
---
 sound/pci/ice1712/wm8766.c |  361 
 sound/pci/ice1712/wm8766.h |  163 
 2 files changed, 524 insertions(+), 0 deletions(-)
 create mode 100644 sound/pci/ice1712/wm8766.c
 create mode 100644 sound/pci/ice1712/wm8766.h

diff --git a/sound/pci/ice1712/wm8766.c b/sound/pci/ice1712/wm8766.c
new file mode 100644
index 000..8072ade
--- /dev/null
+++ b/sound/pci/ice1712/wm8766.c
@@ -0,0 +1,361 @@
+/*
+ *   ALSA driver for ICEnsemble VT17xx
+ *
+ *   Lowlevel functions for WM8766 codec
+ *
+ * Copyright (c) 2012 Ondrej Zary 
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation; either version 2 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program; if not, write to the Free Software
+ *   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include "wm8766.h"
+
+/* low-level access */
+
+static void snd_wm8766_write(struct snd_wm8766 *wm, u16 addr, u16 data)
+{
+   if (addr < WM8766_REG_RESET)
+   wm->regs[addr] = data;
+   wm->ops.write(wm, addr, data);
+}
+
+/* mixer controls */
+
+static const DECLARE_TLV_DB_SCALE(wm8766_tlv, -12750, 50, 1);
+
+static struct snd_wm8766_ctl snd_wm8766_default_ctl[WM8766_CTL_COUNT] = {
+   [WM8766_CTL_CH1_VOL] = {
+   .name = "Channel 1 Playback Volume",
+   .type = SNDRV_CTL_ELEM_TYPE_INTEGER,
+   .tlv = wm8766_tlv,
+   .reg1 = WM8766_REG_DACL1,
+   .reg2 = WM8766_REG_DACR1,
+   .mask1 = WM8766_VOL_MASK,
+   .mask2 = WM8766_VOL_MASK,
+   .max = 0xff,
+   .flags = WM8766_FLAG_STEREO | WM8766_FLAG_VOL_UPDATE,
+   },
+   [WM8766_CTL_CH2_VOL] = {
+   .name = "Channel 2 Playback Volume",
+   .type = SNDRV_CTL_ELEM_TYPE_INTEGER,
+   .tlv = wm8766_tlv,
+   .reg1 = WM8766_REG_DACL2,
+   .reg2 = WM8766_REG_DACR2,
+   .mask1 = WM8766_VOL_MASK,
+   .mask2 = WM8766_VOL_MASK,
+   .max = 0xff,
+   .flags = WM8766_FLAG_STEREO | WM8766_FLAG_VOL_UPDATE,
+   },
+   [WM8766_CTL_CH3_VOL] = {
+   .name = "Channel 3 Playback Volume",
+   .type = SNDRV_CTL_ELEM_TYPE_INTEGER,
+   .tlv = wm8766_tlv,
+   .reg1 = WM8766_REG_DACL3,
+   .reg2 = WM8766_REG_DACR3,
+   .mask1 = WM8766_VOL_MASK,
+   .mask2 = WM8766_VOL_MASK,
+   .max = 0xff,
+   .flags = WM8766_FLAG_STEREO | WM8766_FLAG_VOL_UPDATE,
+   },
+   [WM8766_CTL_CH1_SW] = {
+   .name = "Channel 1 Playback Switch",
+   .type = SNDRV_CTL_ELEM_TYPE_BOOLEAN,
+   .reg1 = WM8766_REG_DACCTRL2,
+   .mask1 = WM8766_DAC2_MUTE1,
+   .flags = WM8766_FLAG_INVERT,
+   },
+   [WM8766_CTL_CH2_SW] = {
+   .name = "Channel 2 Playback Switch",
+   .type = SNDRV_CTL_ELEM_TYPE_BOOLEAN,
+   .reg1 = WM8766_REG_DACCTRL2,
+   .mask1 = WM8766_DAC2_MUTE2,
+   .flags = WM8766_FLAG_INVERT,
+   },
+   [WM8766_CTL_CH3_SW] = {
+   .name = "Channel 3 Playback Switch",
+   .type = SNDRV_CTL_ELEM_TYPE_BOOLEAN,
+   .reg1 = WM8766_REG_DACCTRL2,
+   .mask1 = WM8766_DAC2_MUTE3,
+   .flags = WM8766_FLAG_INVERT,
+   },
+   [WM8766_CTL_PHASE1_SW] = {
+   .name = "Channel 1 Phase Invert Playback Switch",
+   .type = SNDRV_CTL_ELEM_TYPE_BOOLEAN,
+   .reg1 = WM8766_REG_IFCTRL,
+   .mask1 = WM8766_PHASE_INVERT1,
+   },
+   [WM8766_CTL_PHASE2_SW] = {
+   .name = "Channel 2 Phase Invert Playback Switch",
+   .type = SNDRV_CTL_ELEM_TYPE_BOOLEAN,
+   .reg1 = WM8766_REG_IFCTRL,
+   .mask1 = WM8766_PHASE_INVERT2,
+   },
+   [WM8766_CTL_PHASE3_SW] = {
+   .name = "Channel 3 Phase Invert Playback Switch",
+   .type = SNDRV_CTL_ELEM_TYPE_BOOLEAN,
+   .reg1 = WM8766_REG_IFCTRL,
+   .mask1 = WM8766_PHASE_INVERT3,
+   },
+   [WM8766_

[PATCH 1/5] snd-ice1712: add chip_exit callback

2012-10-14 Thread Ondrej Zary

Add chip_exit callback to allow card subdrivers to do cleanup work on module
removal.

Needed by Philips PSC724 subdriver to cancel delayed work.

Signed-off-by: Ondrej Zary 
---
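The pattern added here is an optional per-card cleanup hook that runs before the card is freed. A minimal userspace sketch follows, assuming hypothetical sketch_* types; pending_work stands in for the delayed work the PSC724 subdriver has to cancel, and the freed flag stands in for snd_card_free().

```c
#include <assert.h>
#include <stddef.h>

struct sketch_chip;

struct sketch_card_info {
	const char *name;
	void (*chip_exit)(struct sketch_chip *chip);	/* optional, may be NULL */
};

struct sketch_chip {
	const struct sketch_card_info *card_info;	/* set when the card is matched */
	int pending_work;	/* stands in for queued delayed work */
	int freed;		/* stands in for snd_card_free() having run */
};

/* Example subdriver hook: cancel whatever work is still queued. */
static void sketch_psc724_exit(struct sketch_chip *chip)
{
	chip->pending_work = 0;
}

/* Call the hook if the matched card provides one, then free the card. */
static void sketch_remove(struct sketch_chip *chip)
{
	assert(chip != NULL);

	if (chip->card_info && chip->card_info->chip_exit)
		chip->card_info->chip_exit(chip);
	chip->freed = 1;
}
```

Both NULL checks matter: card_info stays unset when no table entry matched, and most cards leave chip_exit unset.
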
 sound/pci/ice1712/ice1712.c |8 +++-
 sound/pci/ice1712/ice1712.h |3 +++
 sound/pci/ice1712/ice1724.c |8 +++-
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/sound/pci/ice1712/ice1712.c b/sound/pci/ice1712/ice1712.c
index 5be2e12..f42b5b1 100644
--- a/sound/pci/ice1712/ice1712.c
+++ b/sound/pci/ice1712/ice1712.c
@@ -2686,6 +2686,7 @@ static int __devinit snd_ice1712_probe(struct pci_dev *pci,
for (tbl = card_tables; *tbl; tbl++) {
for (c = *tbl; c->subvendor; c++) {
if (c->subvendor == ice->eeprom.subvendor) {
+   ice->card_info = c;
strcpy(card->shortname, c->name);
if (c->driver) /* specific driver? */
strcpy(card->driver, c->driver);
@@ -2799,7 +2800,12 @@ static int __devinit snd_ice1712_probe(struct pci_dev *pci,
 
 static void __devexit snd_ice1712_remove(struct pci_dev *pci)
 {
-   snd_card_free(pci_get_drvdata(pci));
+   struct snd_card *card = pci_get_drvdata(pci);
+   struct snd_ice1712 *ice = card->private_data;
+
+   if (ice->card_info && ice->card_info->chip_exit)
+   ice->card_info->chip_exit(ice);
+   snd_card_free(card);
pci_set_drvdata(pci, NULL);
 }
 
diff --git a/sound/pci/ice1712/ice1712.h b/sound/pci/ice1712/ice1712.h
index 0da778a..79f8aeb 100644
--- a/sound/pci/ice1712/ice1712.h
+++ b/sound/pci/ice1712/ice1712.h
@@ -288,6 +288,7 @@ struct snd_ice1712_spdif {
} ops;
 };
 
+struct snd_ice1712_card_info;
 
 struct snd_ice1712 {
unsigned long conp_dma_size;
@@ -324,6 +325,7 @@ struct snd_ice1712 {
struct snd_info_entry *proc_entry;
 
struct snd_ice1712_eeprom eeprom;
+   struct snd_ice1712_card_info *card_info;
 
unsigned int pro_volumes[20];
unsigned int omni:1;/* Delta Omni I/O */
@@ -517,6 +519,7 @@ struct snd_ice1712_card_info {
char *model;
char *driver;
int (*chip_init)(struct snd_ice1712 *);
+   void (*chip_exit)(struct snd_ice1712 *);
int (*build_controls)(struct snd_ice1712 *);
unsigned int no_mpu401:1;
unsigned int mpu401_1_info_flags;
diff --git a/sound/pci/ice1712/ice1724.c b/sound/pci/ice1712/ice1724.c
index bed9f34..a529d30 100644
--- a/sound/pci/ice1712/ice1724.c
+++ b/sound/pci/ice1712/ice1724.c
@@ -2348,6 +2348,7 @@ static int __devinit snd_vt1724_read_eeprom(struct snd_ice1712 *ice,
ice->eeprom.subvendor = c->subvendor;
} else if (c->subvendor != ice->eeprom.subvendor)
continue;
+   ice->card_info = c;
if (!c->eeprom_size || !c->eeprom_data)
goto found;
/* if the EEPROM is given by the driver, use it */
@@ -2788,7 +2789,12 @@ __found:
 
 static void __devexit snd_vt1724_remove(struct pci_dev *pci)
 {
-   snd_card_free(pci_get_drvdata(pci));
+   struct snd_card *card = pci_get_drvdata(pci);
+   struct snd_ice1712 *ice = card->private_data;
+
+   if (ice->card_info && ice->card_info->chip_exit)
+   ice->card_info->chip_exit(ice);
+   snd_card_free(card);
pci_set_drvdata(pci, NULL);
 }
 
-- 
Ondrej Zary


[PATCH 0/5] snd-ice1712: Add Philips PSC724 Ultimate Edge

2012-10-14 Thread Ondrej Zary

Hello,
this patch series adds full support for the Philips PSC724 Ultimate Edge sound
card to the snd-ice1712 driver. Unlike other snd-ice1712 subdrivers, the codec
code is split into separate files, reusable by other ice1712 subdrivers.

Working: all analog outputs (front, rear, center+lfe, headphone) and inputs
(front mic, rear mic, line, cd, aux), mixer, headphone jack detection,
suspend/resume

Untested: SPDIF

-- 
Ondrej Zary