perf fuzzer crash [PATCH] perf: Get group events reference before moving the group

2015-01-15 Thread Jiri Olsa
hi Vince,
I was able to reproduce the issue you described in:
  http://marc.info/?l=linux-kernel&m=141806390822670&w=2

I might have found one way that could lead to screwing up
context's refcounts.. could you please try attached patch?

I'm now on 2 days of no crash while it used to happen
3 times a day before.

Or you can use the following git branch:
  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/trinity_fix

thanks,
jirka


---
We need to make sure that no event in the group loses its
last reference and gets removed from the context during
the group move in the perf syscall.

This can happen if the child exits and calls put_event()
on a parent event that has already been closed, as in the
following scenario:

  - T1 creates software event E1
  - T1 creates other software events as group with E1 as group leader
  - T1 forks T2
  - T2 has cloned E1 event that holds reference on E1
  - T1 closes event within E1 group (say E3), the event stays alive
    due to the T2 reference
  - following happens concurrently:
    A) T1 creates hardware event E2 with groupleader E1
    B) T2 exits

ad A) T1 triggers the E1 group move into hardware context:
  mutex_lock(E1->ctx)
    - remove E1 group only from the E1->ctx context, leaving
      the group links untouched
  mutex_unlock(E1->ctx)
  mutex_lock(E2->ctx)
    - install E1 group into E2->ctx using the E1 group links
  mutex_unlock(E2->ctx)

ad B) put_event(E3) is called and E3 is removed from E1->ctx
  completely, including group links

If 'A' and 'B' race, we will get unbalanced refcounts,
because of the removed group links.

Add get_group/put_group functions that take and drop an
event reference for every member of the group.
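
The idea of pinning a whole group can be sketched in userspace C. This is
only an illustration under stated assumptions, not the kernel code: struct
event, ref_get_not_zero() and ref_put() are hypothetical stand-ins for
struct perf_event, atomic_long_inc_not_zero() and put_event(), and the
siblings are passed as a plain array instead of a list walked under
ctx->mutex.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct perf_event: only the refcount matters. */
struct event {
	atomic_long refcount;
};

/* Take a reference unless the count has already dropped to zero. */
bool ref_get_not_zero(struct event *e)
{
	long old = atomic_load(&e->refcount);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&e->refcount, &old, old + 1))
			return true;
	}
	return false;
}

void ref_put(struct event *e)
{
	atomic_fetch_sub(&e->refcount, 1);
}

/* Drop every reference recorded in a NULL-terminated array, then free it. */
void put_group(struct event **arr)
{
	for (size_t i = 0; arr[i] != NULL; i++)
		ref_put(arr[i]);
	free(arr);
}

/*
 * Pin the leader and all siblings. Returns a NULL-terminated array of the
 * pinned events, or NULL if allocation fails or any member is already
 * dying; references taken before the failure are dropped again.
 */
struct event **get_group(struct event *leader,
			 struct event **siblings, size_t nr_siblings)
{
	/* +1 for the leader, +1 for the terminating NULL */
	struct event **arr = calloc(nr_siblings + 2, sizeof(*arr));
	size_t n = 0;

	if (!arr)
		return NULL;
	if (!ref_get_not_zero(leader))
		goto fail;
	arr[n++] = leader;

	for (size_t i = 0; i < nr_siblings; i++) {
		if (!ref_get_not_zero(siblings[i]))
			goto fail;
		arr[n++] = siblings[i];
	}
	return arr;
fail:
	put_group(arr);	/* arr is zero-filled, so it is always terminated */
	return NULL;
}
```

Unlike the patch, this sketch allocates the array before taking the leader
reference, so a failed allocation leaves no reference behind. Because the
array is zero-filled, the rollback path can reuse put_group() at any point.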

Signed-off-by: Jiri Olsa 
---
 kernel/events/core.c | 61 
 1 file changed, 61 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index af0a5ba4e21d..1922bae9f24e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7250,6 +7250,55 @@ out:
 	return ret;
 }
 
+static void put_group(struct perf_event **group_arr)
+{
+	struct perf_event *event;
+	int i = 0;
+
+	while (event = group_arr[i++]) {
+		put_event(event);
+	}
+
+	kfree(group_arr);
+}
+
+static int get_group(struct perf_event *leader, struct perf_event ***group_arr)
+{
+	struct perf_event_context *ctx = leader->ctx;
+	struct perf_event *sibling, **arr = NULL;
+	int i = 0, err = -ENOMEM;
+	size_t size;
+
+	if (!atomic_long_inc_not_zero(&leader->refcount))
+		return -EINVAL;
+
+	mutex_lock(&ctx->mutex);
+	/* + 1 for leader and +1 for final NULL */
+	size = (leader->nr_siblings + 2) * sizeof(leader);
+
+	arr = *group_arr = kzalloc(size, GFP_KERNEL);
+	if (!arr)
+		goto err;
+
+	err = -EINVAL;
+	arr[i++] = leader;
+
+	list_for_each_entry(sibling, &leader->sibling_list, group_entry) {
+		if (!atomic_long_inc_not_zero(&sibling->refcount))
+			goto err;
+
+		arr[i++] = sibling;
+	}
+
+	mutex_unlock(&ctx->mutex);
+	return 0;
+err:
+	mutex_unlock(&ctx->mutex);
+	if (arr)
+		put_group(arr);
+	return err;
+}
+
 /**
  * sys_perf_event_open - open a performance event, associate it to a task/cpu
  *
@@ -7263,6 +7312,7 @@ SYSCALL_DEFINE5(perf_event_open,
 		pid_t, pid, int, cpu, int, group_fd, unsigned long, flags)
 {
 	struct perf_event *group_leader = NULL, *output_event = NULL;
+	struct perf_event **group_arr = NULL;
 	struct perf_event *event, *sibling;
 	struct perf_event_attr attr;
 	struct perf_event_context *ctx;
@@ -7443,6 +7493,12 @@ SYSCALL_DEFINE5(perf_event_open,
 		goto err_context;
 	}
 
+	if (move_group) {
+		err = get_group(group_leader, &group_arr);
+		if (err)
+			goto err_context;
+	}
+
 	event_file = anon_inode_getfile("[perf_event]", &perf_fops, event,
 					f_flags);
 	if (IS_ERR(event_file)) {
@@ -7490,6 +7546,9 @@ SYSCALL_DEFINE5(perf_event_open,
 	perf_unpin_context(ctx);
 	mutex_unlock(&ctx->mutex);
 
+	if (move_group)
+		put_group(group_arr);
+
 	put_online_cpus();
 
 	event->owner = current;
@@ -7515,6 +7574,8 @@ SYSCALL_DEFINE5(perf_event_open,
 	return event_fd;
 
 err_context:
+	if (group_arr)
+		put_group(group_arr);
 	perf_unpin_context(ctx);
 	put_ctx(ctx);
 err_alloc:
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Use dwfl_report_elf() instead of offline

2015-01-15 Thread Jiri Olsa
On Wed, Jan 14, 2015 at 02:10:45PM -0800, Sukadev Bhattiprolu wrote:
> 
> From 8e6fb4c58d0d9f4798c191d840e32084b1217cc3 Mon Sep 17 00:00:00 2001
> From: Sukadev Bhattiprolu 
> Date: Fri, 21 Nov 2014 20:33:53 -0500
> Subject: [PATCH 1/1] Use dwfl_report_elf() instead of offline.
> 
> dwfl_report_offline() works only when libraries are prelinked.
> 
> Replace dwfl_report_offline() with dwfl_report_elf() so we correctly
> extract debug info even from libraries that are not prelinked.
> 
> Reported-by: Jiri Olsa 
> Signed-off-by: Sukadev Bhattiprolu 

Tested-by: Jiri Olsa 

thanks,
jirka


Re: [PATCH 1/1] arch/x86/kvm/vmx.c: Fix external interrupts inject directly bug with guestos RFLAGS.IF=0

2015-01-15 Thread Li Kaihang
Hello, please see my answers inline below:



From:    Radim Krčmář
To:      Li Kaihang
Cc:      g...@kernel.org, pbonz...@redhat.com, t...@linutronix.de,
         mi...@redhat.com, h...@zytor.com, x...@kernel.org, k...@vger.kernel.org,
         linux-kernel@vger.kernel.org
Date:    2015-01-16 02:09 AM
Subject: Re: [PATCH 1/1] arch/x86/kvm/vmx.c: Fix external interrupts
         inject directly bug with guestos RFLAGS.IF=0



2015-01-15 20:36+0800, Li Kaihang:
> This patch fix a external interrupt injecting bug in linux 3.19-rc4.

Was the bug introduced in earlier 3.19 release candidate?

Li Kaihang: Yes, we also found this problem in 2.6.

> GuestOS is running and handling some interrupt with RFLAGS.IF = 0 while a 
> external interrupt coming,
> then can lead to a vm exit,in this case,we must avoid inject this external 
> interrupt or it will generate
> a processor hardware exception causing virtual machine crash.

What is the source of this exception?  (Is there a reproducer?)

Li Kaihang: The exception is raised by the Intel processor hardware, because
injecting an external interrupt vector is forbidden while the guest OS has
RFLAGS.IF = 0. The hypervisor software has to ensure this, according to the
Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3.
The bug is probabilistic: if the code between cli and sti in a guest OS
interrupt handler is very short, the probability of hitting it is very low.
The event is like a moving trap: the bug is triggered when an external
interrupt arrives while the guest OS is executing between the cli and sti
instructions. It could be verified by constructing a special guest OS
interrupt handler. A general-purpose OS running in a KVM VM can also hit
this bug.

> Now, I show more details about this problem:
>
> A general external interrupt processing for a running virtual machine is 
> shown in the following:
>
> Step 1:
>  a ext intr gen a vm_exit --> vmx_complete_interrupts -->
> __vmx_complete_interrupts --> case INTR_TYPE_EXT_INTR:
> kvm_queue_interrupt(vcpu, vector, type == INTR_TYPE_SOFT_INTR);
>
> Step 2:
>  kvm_x86_ops->handle_external_intr(vcpu);
>
> Step 3:
>  get back to vcpu_enter_guest after a while cycle,then run 
> inject_pending_event
>
> Step 4:
>  if (vcpu->arch.interrupt.pending) {
>kvm_x86_ops->set_irq(vcpu);
>return 0;
>}
>
> Step 5:
>  kvm_x86_ops->run(vcpu) --> vm_entry inject vector to guestos IDT
>
> for the above steps, step 4 and 5 will be a processor hardware exception if 
> step1 happen while guestos RFLAGS.IF = 0, that is to say, guestos interrupt 
> is disabled.
> So we should add a logic to judge in step 1 whether a external interrupt need 
> to be pended then inject directly, in the process, we don't need to worry 
> about
> this external interrupt lost because the next Step 2 will handle and choose a 
> best chance to inject it by virtual interrupt controller.

Can you explain the relation between vectored events (Step 1) and
external interrupts (Step 2)?
(The bug happens when external interrupt arrives during event delivery?)

Li Kaihang: An external interrupt to a running VM can trigger a VM exit,
which is handled in step 1. The interrupt vector is then processed in step 2
by kvm_x86_ops->handle_external_intr(vcpu), which jumps to the host OS IDT
to complete external interrupt handling. The external interrupt handler in
the host OS IDT may inject the interrupt into the virtual interrupt
controller if it has been registered as needed by the virtual machine.
The bug never occurs in steps 1 and 2 themselves, but
vcpu->arch.interrupt.pending is set in step 1. If this pending interrupt
should not be injected, it is still passed on to step 4, which completes the
dangerous external interrupt injection. See my answer above for what
"pending should not be injected" means. Our solution is to clear the invalid
external interrupt pending state by adding a check in step 1, so that the
erroneous injection cannot happen.

Why isn't the delivered event lost?
(It should be different from the external interrupt.)

Li Kaihang: Please refer to the answer above. An external interrupt in step 1
can only reach the case INTR_TYPE_EXT_INTR branch in the patch code, so it
should not affect the delivery of other event types. There is another
possibility: the external interrupt needed by the running VM is not
registered in the host OS IDT handler chain. That is a separate problem, of
course; even so, it is dangerous to inject the external interrupt directly
without checking the current guest OS RFLAGS.IF state.

Thanks.
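
The check described for step 1 can be sketched as follows. This is a
minimal, self-contained illustration and not the patch itself (the diff is
cut off in this archive): struct vcpu_sketch and both helper functions are
hypothetical, and only the position of the IF flag (bit 9 of RFLAGS) is an
architectural fact.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 9 of RFLAGS is the interrupt-enable flag (IF). */
#define X86_EFLAGS_IF (UINT64_C(1) << 9)

/* Hypothetical, reduced view of the vCPU state used in this sketch. */
struct vcpu_sketch {
	uint64_t guest_rflags;	/* guest RFLAGS at the time of the VM exit */
	bool     intr_pending;	/* plays the role of vcpu->arch.interrupt.pending */
	uint8_t  intr_vector;
};

/* An external interrupt may only be delivered while the guest has IF set. */
bool guest_accepts_ext_intr(const struct vcpu_sketch *v)
{
	return (v->guest_rflags & X86_EFLAGS_IF) != 0;
}

/*
 * Step 1 of the flow: mark the vector as pending for direct injection only
 * if the guest can take it. Returns true if the vector was queued.
 */
bool queue_ext_intr_if_allowed(struct vcpu_sketch *v, uint8_t vector)
{
	if (!guest_accepts_ext_intr(v)) {
		/* Leave delivery to the virtual interrupt controller (step 2). */
		v->intr_pending = false;
		return false;
	}
	v->intr_vector = vector;
	v->intr_pending = true;
	return true;
}
```

With the pending flag cleared in the IF = 0 case, the step 4 test on the
pending flag would no longer reach set_irq() for that vector.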

>
>
> Signed-off-by: Li kaihang 
> ---
>  arch/x86/kvm/vmx.c |   20 ++--
>  1 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index d4c58d8..e8311ee 100644
> 

RE: [PATCH v8 9/11] xfstests: generic/043: Test multiple fallocate insert/collapse range calls

2015-01-15 Thread Namjae Jeon
> On Thu, Jan 15, 2015 at 07:14:26PM +0900, Namjae Jeon wrote:
> >
> > > > +_require_scratch
> > > > +_require_xfs_io_command "fiemap"
> > > > +_require_xfs_io_command "finsert"
> > > > +_require_xfs_io_command "fcollapse"
> > > > +_do_die_on_error=y
> > >
> > > What is _do_die_on_error for? Seems like that's only relevant for using
> > > _do()?
> > >
> > > > +src=$SCRATCH_MNT/testfile
> > > > +dest=$SCRATCH_MNT/testfile.dest
> > > > +BLOCKS=100
> > > > +BSIZE=`get_block_size $SCRATCH_MNT`
> > > > +
> > >
> > > rm -f $seqres.full
> > >
> > > ... to clear out the old .full file.
> > >
> > > > +_scratch_mkfs MKFS_OPTIONS >> $seqres.full 2>&1
> > >
> > > You don't need MKFS_OPTIONS here. In fact, this currently causes a mkfs
> > > failure (missing $) that we don't detect because we aren't checking that
> > > the mkfs actually succeeds. All we need to do here is:
> > >
> > > _scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
> > >
> > > If you dig down into _scratch_mkfs(), you'll see that it already
> > > includes $MKFS_OPTIONS and thus formats the fs as specified by the test
> > > config.
> > >
> > > It might also be a good idea to check that the _scratch_mount below
> > > succeeds as well...
> > >
> > > > +_scratch_mount >> $seqres.full 2>&1
> > > > +length=$(($BLOCKS * $BSIZE))
> > > > +
> > > > +# Write file
> > > > +$XFS_IO_PROG -f -c "pwrite 0 $length" -c fsync $src > /dev/null
> > > > +cp $src $dest
> > > > +
> > >
> > > It seems quite unlikely for this to not create a single extent given the
> > > smallish file size and freshly created fs, but who knows with various fs
> > > types, test configurations, test device sizes, etc. Another option could
> > > be to check the starting extent count and verify the ending extent count
> > > matches, rather than assume hardcoded values of 1.
> > >
> > > To be honest, even just including the starting extent count in the
> > > golden output (e.g., add an fiemap command here as well) might be good
> > > enough to distinguish that failure path from something going wrong in
> > > the collapse path, should it ever occur.
> > >
> > > > +# Insert alternate blocks
> > > > +for (( j=0; j < $(($BLOCKS/2)); j++ )); do
> > > > +   offset=$((($j*$BSIZE)*2))
> > > > +   $XFS_IO_PROG -c "finsert $offset $BSIZE" $dest > /dev/null
> > > > +done
> > > > +
> > > > +# Check if 100 extents are present
> > > > +$XFS_IO_PROG -c "fiemap -v" $dest | grep "^ *[0-9]*:" |wc -l
> > > > +
> > > > +_check_scratch_fs
> > > > +if [ $? -ne 0 ]; then
> > > > +   status=1
> > > > +   exit
> > > > +fi
> > > > +
> > > > +# Collapse alternate blocks
> > > > +for (( j=0; j < $(($BLOCKS/2)); j++ )); do
> > > > +   offset=$((($j*$BSIZE)))
> > > > +   $XFS_IO_PROG -c "fcollapse $offset $BSIZE" $dest > /dev/null
> > > > +done
> > > > +
> > > > +# Check if 1 extents are present
> > > > +$XFS_IO_PROG -c "fiemap -v" $dest | grep "^ *[0-9]*:" |wc -l
> > > > +
> > > > +# compare original file and test file.
> > > > +cmp $src $dest || _fail "file bytes check failed"
> > > > +
> > > > +_check_scratch_fs
> > > > +if [ $? -ne 0 ]; then
> > > > +   status=1
> > > > +   exit
> > > > +fi
> > > > +
> > > > +umount $SCRATCH_MNT
> > > > +
> > >
> > > The scratch device is unmounted and checked after each test that uses it
> > > so the above is unnecessary.
> > I checked each of your review points and updated the patch.
> >
> > Could you please review below patch ?
> > Thanks a lot!
> >
> > --
> > Subject: [PATCH v9 9/11] xfstests: generic/043: Test multiple fallocate
> >  insert/collapse range calls
> >
> > This testcase(043) tries to test finsert range a single alternate block
> > multiple times and test merge code of collapse range.
> >
> > Signed-off-by: Namjae Jeon 
> > Signed-off-by: Ashish Sangwan 
> > ---
> >  tests/generic/043 |   96 
> > +
> >  tests/generic/043.out |2 +
> >  tests/generic/group   |1 +
> >  3 files changed, 99 insertions(+), 0 deletions(-)
> >  create mode 100644 tests/generic/043
> >  create mode 100644 tests/generic/043.out
> >
> > diff --git a/tests/generic/043 b/tests/generic/043
> > new file mode 100644
> > index 000..a6f91ce
> > --- /dev/null
> > +++ b/tests/generic/043
> > @@ -0,0 +1,96 @@
> > +#! /bin/bash
> > +# FS QA Test No. generic/043
> > +#
> > +# Test multiple fallocate insert/collapse range calls on same file.
> > +# Call insert range a single alternate block multiple times until the file
> > +# is left with 100 extents and as much number of extents. And Call collapse
> > +# range about the previously inserted ranges to test merge code of collapse
> > +# range. Also check for data integrity and file system consistency.
> > +#---
> > +# Copyright (c) 2014 Samsung Electronics.  All Rights Reserved.
> > +#
> > +# This program is free software; 
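
The offset arithmetic in the two loops of the test (insert at j*BSIZE*2,
collapse at j*BSIZE) can be checked with a small block-level model. This is
only a sketch under simplifying assumptions: the file is an array of block
ids, a hole is marked with HOLE, and every maximal run of data or hole
blocks is counted as one row of the map. All names here are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define HOLE (-1)

/* Insert one hole block at index pos; blocks at and after pos shift right. */
size_t insert_block(int *file, size_t nblocks, size_t pos)
{
	memmove(&file[pos + 1], &file[pos], (nblocks - pos) * sizeof(file[0]));
	file[pos] = HOLE;
	return nblocks + 1;
}

/* Remove the block at index pos; blocks after pos shift left. */
size_t collapse_block(int *file, size_t nblocks, size_t pos)
{
	memmove(&file[pos], &file[pos + 1],
		(nblocks - pos - 1) * sizeof(file[0]));
	return nblocks - 1;
}

/* Count maximal runs of data blocks and of hole blocks. */
size_t count_runs(const int *file, size_t nblocks)
{
	size_t runs = 0;

	for (size_t i = 0; i < nblocks; i++) {
		int is_hole = (file[i] == HOLE);

		if (i == 0 || is_hole != (file[i - 1] == HOLE))
			runs++;
	}
	return runs;
}

/* First loop of the test: insert a hole at block 2*j, blocks/2 times. */
size_t insert_alternate(int *file, size_t nblocks, size_t blocks)
{
	for (size_t j = 0; j < blocks / 2; j++)
		nblocks = insert_block(file, nblocks, 2 * j);
	return nblocks;
}

/* Second loop of the test: collapse block j, blocks/2 times. */
size_t collapse_alternate(int *file, size_t nblocks, size_t blocks)
{
	for (size_t j = 0; j < blocks / 2; j++)
		nblocks = collapse_block(file, nblocks, j);
	return nblocks;
}
```

In this model, 100 data blocks become 100 runs after the insert loop
(50 holes and 50 data runs, the last one covering the untouched second half
of the file), and the collapse loop removes exactly the inserted holes,
leaving one run with the original contents. The caller has to provide room
for nblocks + blocks/2 entries.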

RE: [PATCH v8 11/11] xfstests: fsx: Add fallocate insert range operation

2015-01-15 Thread Namjae Jeon

> >
> > > > }
> > > > do_collapse_range(offset, size);
> > > > break;
> > > > +   case OP_INSERT_RANGE:
> > > > +   TRIM_OFF(offset, file_size);
> > > > +   TRIM_LEN(offset, size, maxfilelen - file_size);
> > >
> > > Ugh, I hit a crash down in do_insert_range() due to the memset() on
> > > good_buf going off the end of the buffer. It looks like the TRIM_LEN()
> > > here is not correct and can result in a really large length value.
> > >
> > > Perhaps something like TRIM_LEN(file_size, length, maxfilelen) is what
> > > we want for insert range ops?
> > Hi Brian,
> > Oops, Sorry about that.
> > And your change is correct.
> >
> > Could you review the below updated patch ?
> >
> 
> This one looks good. Also, I ran overnight with the same change and it
> went for 23m ops. Not sure why it ended up stopping as the output files
> compare fine. I'll get it running on another box and see if I can
> repeat.
> 
> Reviewed-by: Brian Foster 
Thank you. If you find any issue, please share it with me.
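
The length problem Brian describes can be reproduced in isolation. The
sketch below assumes TRIM_LEN has the shape shown here (clamp len so that
off + len does not exceed size) and that all operands are unsigned, as the
crash report suggests; the two wrapper functions are hypothetical and exist
only to compare both argument orders.

```c
/*
 * Clamp len so that [off, off + len) does not extend past size.
 * Assumed to mirror the TRIM_LEN helper in fsx.
 */
#define TRIM_LEN(off, len, size)		\
do {						\
	if ((off) + (len) > (size))		\
		(len) = (size) - (off);		\
} while (0)

/* Original call: TRIM_LEN(offset, size, maxfilelen - file_size). */
unsigned long trim_insert_buggy(unsigned long offset, unsigned long len,
				unsigned long file_size,
				unsigned long maxfilelen)
{
	/* If offset > maxfilelen - file_size, the subtraction wraps around. */
	TRIM_LEN(offset, len, maxfilelen - file_size);
	return len;
}

/* Suggested call: TRIM_LEN(file_size, length, maxfilelen). */
unsigned long trim_insert_fixed(unsigned long len, unsigned long file_size,
				unsigned long maxfilelen)
{
	/* The file grows by len, so file_size + len must fit in maxfilelen. */
	TRIM_LEN(file_size, len, maxfilelen);
	return len;
}
```

For example, with offset 900, length 200, file_size 1000 and maxfilelen
1500, the original form computes 500 - 900 in unsigned arithmetic and ends
up with a length far beyond the buffer, while the suggested form leaves the
length at 200.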

> 
> > Thanks!
> >
> > --
> > Subject: [PATCH v9 11/11] xfstests: fsx: Add fallocate insert range 
> > operation
> >
> > This commit adds fallocate FALLOC_FL_INSERT_RANGE support for fsx.
> >
> > Signed-off-by: Namjae Jeon 
> > Signed-off-by: Ashish Sangwan 
> > ---
> >  ltp/fsx.c |  124 
> > -
> >  1 files changed, 114 insertions(+), 10 deletions(-)
> >
> > diff --git a/ltp/fsx.c b/ltp/fsx.c
> > index 3709419..9fed5b2 100644
> > --- a/ltp/fsx.c
> > +++ b/ltp/fsx.c
> > @@ -95,7 +95,8 @@ int   logcount = 0;   /* total ops */
> >  #define OP_PUNCH_HOLE  6
> >  #define OP_ZERO_RANGE  7
> >  #define OP_COLLAPSE_RANGE  8
> > -#define OP_MAX_FULL9
> > +#define OP_INSERT_RANGE9
> > +#define OP_MAX_FULL10
> >
> >  /* operation modifiers */
> >  #define OP_CLOSEOPEN   100
> > @@ -145,6 +146,7 @@ int fallocate_calls = 1;/* -F flag 
> > disables */
> >  int punch_hole_calls = 1;   /* -H flag disables */
> >  int zero_range_calls = 1;   /* -z flag disables */
> >  intcollapse_range_calls = 1;   /* -C flag disables */
> > +intinsert_range_calls = 1; /* -i flag disables */
> >  intmapped_reads = 1;   /* -R flag disables it */
> >  intfsxgoodfd = 0;
> >  into_direct;   /* -Z */
> > @@ -339,6 +341,14 @@ logdump(void)
> >  lp->args[0] + lp->args[1])
> > prt("\t**");
> > break;
> > +   case OP_INSERT_RANGE:
> > +   prt("INSERT 0x%x thru 0x%x\t(0x%x bytes)",
> > +   lp->args[0], lp->args[0] + lp->args[1] - 1,
> > +   lp->args[1]);
> > +   if (badoff >= lp->args[0] && badoff <
> > +lp->args[0] + lp->args[1])
> > +   prt("\t**");
> > +   break;
> > case OP_SKIPPED:
> > prt("SKIPPED (no operation)");
> > break;
> > @@ -1012,6 +1022,59 @@ do_collapse_range(unsigned offset, unsigned length)
> >  }
> >  #endif
> >
> > +#ifdef FALLOC_FL_INSERT_RANGE
> > +void
> > +do_insert_range(unsigned offset, unsigned length)
> > +{
> > +   unsigned end_offset;
> > +   int mode = FALLOC_FL_INSERT_RANGE;
> > +
> > +   if (length == 0) {
> > +   if (!quiet && testcalls > simulatedopcount)
> > +   prt("skipping zero length insert range\n");
> > +   log4(OP_SKIPPED, OP_INSERT_RANGE, offset, length);
> > +   return;
> > +   }
> > +
> > +   if ((loff_t)offset >= file_size) {
> > +   if (!quiet && testcalls > simulatedopcount)
> > +   prt("skipping insert range behind EOF\n");
> > +   log4(OP_SKIPPED, OP_INSERT_RANGE, offset, length);
> > +   return;
> > +   }
> > +
> > +   log4(OP_INSERT_RANGE, offset, length, 0);
> > +
> > +   if (testcalls <= simulatedopcount)
> > +   return;
> > +
> > +   end_offset = offset + length;
> > +   if ((progressinterval && testcalls % progressinterval == 0) ||
> > +   (debug && (monitorstart == -1 || monitorend == -1 ||
> > + end_offset <= monitorend))) {
> > +   prt("%lu insert\tfrom 0x%x to 0x%x, (0x%x bytes)\n", testcalls,
> > +   offset, offset+length, length);
> > +   }
> > +   if (fallocate(fd, mode, (loff_t)offset, (loff_t)length) == -1) {
> > +   prt("insert range: %x to %x\n", offset, length);
> > +   prterr("do_insert_range: fallocate");
> > +   report_failure(161);
> > +   }
> > +
> > +   memmove(good_buf + end_offset, good_buf + 

[PATCH V3] mm/thp: Allocate transparent hugepages on local node

2015-01-15 Thread Aneesh Kumar K.V
This makes sure that we try to allocate hugepages from the local node if
allowed by mempolicy. If we can't, we fall back to small page allocation
based on mempolicy. This is based on the observation that allocating pages
on the local node is more beneficial than allocating hugepages on a remote
node.
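
The decision described above can be sketched as a small pure function. This
is an illustration only: the body of alloc_hugepage_vma() is cut off in this
archive, so the sketch follows the commit message and the visible comment,
not the actual code. The enum, the 64-bit node mask and the function name
are hypothetical, and local_node is assumed to be below 64.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified policy modes for this sketch. */
enum policy_mode { POLICY_DEFAULT, POLICY_BIND, POLICY_INTERLEAVE };

/* Returned when the hugepage should not be forced onto the local node. */
#define NO_LOCAL_NODE (-1)

/*
 * Decide whether a hugepage allocation should target the local node only.
 * Returns local_node if the policy allows it, NO_LOCAL_NODE otherwise; in
 * the latter case the caller follows the regular policy path, which may
 * end in a small-page fallback.
 */
int pick_hugepage_node(enum policy_mode mode, bool has_mask,
		       uint64_t nodemask, int local_node)
{
	/* For interleave policy, the current node is not considered. */
	if (mode == POLICY_INTERLEAVE)
		return NO_LOCAL_NODE;

	/* A policy node mask that excludes the local node rules it out. */
	if (has_mask && !(nodemask & (UINT64_C(1) << local_node)))
		return NO_LOCAL_NODE;

	return local_node;
}
```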

Signed-off-by: Aneesh Kumar K.V 
---
Changes from V2:
* Rebase to latest linus tree (cb59670870d90ff8bc31f5f2efc407c6fe4938c0)

 include/linux/gfp.h |  4 
 mm/huge_memory.c| 24 +---
 mm/mempolicy.c  | 40 
 3 files changed, 53 insertions(+), 15 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index b840e3b2770d..60110e06419d 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -335,11 +335,15 @@ alloc_pages(gfp_t gfp_mask, unsigned int order)
 extern struct page *alloc_pages_vma(gfp_t gfp_mask, int order,
struct vm_area_struct *vma, unsigned long addr,
int node);
+extern struct page *alloc_hugepage_vma(gfp_t gfp, struct vm_area_struct *vma,
+  unsigned long addr, int order);
 #else
 #define alloc_pages(gfp_mask, order) \
alloc_pages_node(numa_node_id(), gfp_mask, order)
 #define alloc_pages_vma(gfp_mask, order, vma, addr, node)  \
alloc_pages(gfp_mask, order)
+#define alloc_hugepage_vma(gfp_mask, vma, addr, order) \
+   alloc_pages(gfp_mask, order)
 #endif
 #define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0)
 #define alloc_page_vma(gfp_mask, vma, addr)\
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 817a875f2b8c..031fb1584bbf 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -766,15 +766,6 @@ static inline gfp_t alloc_hugepage_gfpmask(int defrag, 
gfp_t extra_gfp)
return (GFP_TRANSHUGE & ~(defrag ? 0 : __GFP_WAIT)) | extra_gfp;
 }
 
-static inline struct page *alloc_hugepage_vma(int defrag,
- struct vm_area_struct *vma,
- unsigned long haddr, int nd,
- gfp_t extra_gfp)
-{
-   return alloc_pages_vma(alloc_hugepage_gfpmask(defrag, extra_gfp),
-  HPAGE_PMD_ORDER, vma, haddr, nd);
-}
-
 /* Caller must hold page table lock. */
 static bool set_huge_zero_page(pgtable_t pgtable, struct mm_struct *mm,
struct vm_area_struct *vma, unsigned long haddr, pmd_t *pmd,
@@ -795,6 +786,7 @@ int do_huge_pmd_anonymous_page(struct mm_struct *mm, struct 
vm_area_struct *vma,
   unsigned long address, pmd_t *pmd,
   unsigned int flags)
 {
+   gfp_t gfp;
struct page *page;
unsigned long haddr = address & HPAGE_PMD_MASK;
 
@@ -829,8 +821,8 @@ int do_huge_pmd_anonymous_page(struct mm_struct *mm, struct 
vm_area_struct *vma,
}
return 0;
}
-   page = alloc_hugepage_vma(transparent_hugepage_defrag(vma),
-   vma, haddr, numa_node_id(), 0);
+   gfp = alloc_hugepage_gfpmask(transparent_hugepage_defrag(vma), 0);
+   page = alloc_hugepage_vma(gfp, vma, haddr, HPAGE_PMD_ORDER);
if (unlikely(!page)) {
count_vm_event(THP_FAULT_FALLBACK);
return VM_FAULT_FALLBACK;
@@ -1118,10 +1110,12 @@ int do_huge_pmd_wp_page(struct mm_struct *mm, struct 
vm_area_struct *vma,
spin_unlock(ptl);
 alloc:
if (transparent_hugepage_enabled(vma) &&
-   !transparent_hugepage_debug_cow())
-   new_page = alloc_hugepage_vma(transparent_hugepage_defrag(vma),
- vma, haddr, numa_node_id(), 0);
-   else
+   !transparent_hugepage_debug_cow()) {
+   gfp_t gfp;
+
+   gfp = alloc_hugepage_gfpmask(transparent_hugepage_defrag(vma), 
0);
+   new_page = alloc_hugepage_vma(gfp, vma, haddr, HPAGE_PMD_ORDER);
+   } else
new_page = NULL;
 
if (unlikely(!new_page)) {
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 0e0961b8c39c..14604142c2c2 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2030,6 +2030,46 @@ retry_cpuset:
return page;
 }
 
+struct page *alloc_hugepage_vma(gfp_t gfp, struct vm_area_struct *vma,
+   unsigned long addr, int order)
+{
+   struct page *page;
+   nodemask_t *nmask;
+   struct mempolicy *pol;
+   int node = numa_node_id();
+   unsigned int cpuset_mems_cookie;
+
+retry_cpuset:
+   pol = get_vma_policy(vma, addr);
+   cpuset_mems_cookie = read_mems_allowed_begin();
+
+   if (pol->mode != MPOL_INTERLEAVE) {
+   /*
+* For interleave policy, we don't worry about
+* current node. Otherwise if current node is
+* in nodemask, try to allocate hugepage from
+* 

RE: [PATCH v8 2/11] xfs: Add support FALLOC_FL_INSERT_RANGE for fallocate

2015-01-15 Thread Namjae Jeon
> On Wed, Jan 14, 2015 at 01:05:17AM +0900, Namjae Jeon wrote:
> > From: Namjae Jeon 
> >
> > This patch implements fallocate's FALLOC_FL_INSERT_RANGE for XFS.
> >
> > 1) Make sure that both offset and len are block size aligned.
> > 2) Update the i_size of inode by len bytes.
> > 3) Compute the file's logical block number against offset. If the computed
> >block number is not the starting block of the extent, split the extent
> >such that the block number is the starting block of the extent.
> > 4) Shift all the extents which are lying between [offset, last allocated extent]
> >towards right by len bytes. This step will make a hole of len bytes
> >at offset.
> >
> > Signed-off-by: Namjae Jeon 
> > Signed-off-by: Ashish Sangwan 
> > Cc: Brian Foster
> > ---
> 
> Fixes look good (I assume nothing else changed between the few nits
> called out in v7) and survives overnight fsstress and fsx testing
> without any explosions:
Yes, I am also running the tests for a long time. If there is any issue,
I will share it.
Thanks for your review and help :)
> 
> Reviewed-by: Brian Foster 
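
Steps 2 to 4 of the description can be illustrated on a plain extent array.
This is a sketch under simplifying assumptions, not the XFS implementation:
struct extent, insert_range() and the block-granular units are hypothetical,
extents are sorted by start and do not overlap, and step 1 (alignment) is
implied by working in whole blocks.

```c
#include <stddef.h>

/* Hypothetical logical extent, measured in filesystem blocks. */
struct extent {
	unsigned long start;
	unsigned long len;
};

/*
 * Insert a hole of len blocks at block offset. ext[] must have room for
 * n + 1 entries because one extent may be split. Returns the new extent
 * count and grows *size_blocks by len.
 */
size_t insert_range(struct extent *ext, size_t n, unsigned long *size_blocks,
		    unsigned long offset, unsigned long len)
{
	/* Step 2: the file grows by len. */
	*size_blocks += len;

	/* Step 3: split the extent that straddles offset, if there is one. */
	for (size_t i = 0; i < n; i++) {
		unsigned long end = ext[i].start + ext[i].len;

		if (ext[i].start < offset && offset < end) {
			for (size_t k = n; k > i + 1; k--)
				ext[k] = ext[k - 1];
			ext[i + 1].start = offset;
			ext[i + 1].len = end - offset;
			ext[i].len = offset - ext[i].start;
			n++;
			break;
		}
	}

	/* Step 4: shift every extent at or after offset right, last first. */
	for (size_t i = n; i-- > 0; ) {
		if (ext[i].start < offset)
			break;
		ext[i].start += len;
	}
	return n;
}
```

Shifting from the last extent towards offset means no extent is ever moved
onto a range that a later extent still occupies.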
> 



Re: [PATCH v7 00/17] Introduce ACPI for ARM64 based on ACPI 5.1

2015-01-15 Thread Hanjun Guo

On 2015-01-16 04:31, Mark Brown wrote:

On Thu, Jan 15, 2015 at 03:04:37PM -0500, Jason Cooper wrote:

On Thu, Jan 15, 2015 at 07:02:20PM +, Mark Brown wrote:



There's probably a bit of a process problem here - these patches are all
being posted as part of big and apparently controversial threads with
subject lines in the form "ARM / ACPI:" so people could be forgiven for
just not even reading the e-mails enough to notice changes to their
subsystems.  Is it worth posting those patches separately more directly
to the relevant maintainers?



I think it's beneficial to post the entire series as one thread, but to
change the subject line of each patch to adequately reflect the affected
subsystem.


Just changing the subject lines to be more suitable would help, but


OK, I will repost this patch set as you and Jason suggested soon.

Thanks
Hanjun


[PATCH 2/2] ASoC: atmel_ssc_dai: remove clock pin comments

2015-01-15 Thread Bo Shen
As the clock can be obtained from either the TK or the RK pin, remove the
comments that describe only the TK case.

Signed-off-by: Bo Shen 
---

 sound/soc/atmel/atmel_ssc_dai.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/sound/soc/atmel/atmel_ssc_dai.c b/sound/soc/atmel/atmel_ssc_dai.c
index e691aab..198661b 100644
--- a/sound/soc/atmel/atmel_ssc_dai.c
+++ b/sound/soc/atmel/atmel_ssc_dai.c
@@ -452,10 +452,6 @@ static int atmel_ssc_hw_params(struct snd_pcm_substream 
*substream,
case SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_CBM_CFM:
/*
 * I2S format, CODEC supplies BCLK and LRC clocks.
-*
-* The SSC transmit clock is obtained from the BCLK signal on
-* on the TK line, and the SSC receive clock is
-* generated from the transmit clock.
 */
rcmr =SSC_BF(RCMR_PERIOD, 0)
| SSC_BF(RCMR_STTDLY, START_DELAY)
-- 
2.3.0.rc0



[PATCH 1/2] ASoC: atmel_ssc_dai: fix start event for I2S mode

2015-01-15 Thread Bo Shen
According to the I2S specification:
  - WS = 0, channel 1 (left)
  - WS = 1, channel 2 (right)
So, the start event should be TF/RF falling edge.

Reported-by: Songjun Wu 
Signed-off-by: Bo Shen 
---

 sound/soc/atmel/atmel_ssc_dai.c | 18 --
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/sound/soc/atmel/atmel_ssc_dai.c b/sound/soc/atmel/atmel_ssc_dai.c
index 99ff35e..e691aab 100644
--- a/sound/soc/atmel/atmel_ssc_dai.c
+++ b/sound/soc/atmel/atmel_ssc_dai.c
@@ -348,7 +348,6 @@ static int atmel_ssc_hw_params(struct snd_pcm_substream 
*substream,
struct atmel_pcm_dma_params *dma_params;
int dir, channels, bits;
u32 tfmr, rfmr, tcmr, rcmr;
-   int start_event;
int ret;
int fslen, fslen_ext;
 
@@ -457,19 +456,10 @@ static int atmel_ssc_hw_params(struct snd_pcm_substream 
*substream,
 * The SSC transmit clock is obtained from the BCLK signal on
 * on the TK line, and the SSC receive clock is
 * generated from the transmit clock.
-*
-*  For single channel data, one sample is transferred
-* on the falling edge of the LRC clock.
-* For two channel data, one sample is
-* transferred on both edges of the LRC clock.
 */
-   start_event = ((channels == 1)
-   ? SSC_START_FALLING_RF
-   : SSC_START_EDGE_RF);
-
rcmr =SSC_BF(RCMR_PERIOD, 0)
| SSC_BF(RCMR_STTDLY, START_DELAY)
-   | SSC_BF(RCMR_START, start_event)
+   | SSC_BF(RCMR_START, SSC_START_FALLING_RF)
| SSC_BF(RCMR_CKI, SSC_CKI_RISING)
| SSC_BF(RCMR_CKO, SSC_CKO_NONE)
| SSC_BF(RCMR_CKS, ssc->clk_from_rk_pin ?
@@ -478,14 +468,14 @@ static int atmel_ssc_hw_params(struct snd_pcm_substream 
*substream,
rfmr =SSC_BF(RFMR_FSEDGE, SSC_FSEDGE_POSITIVE)
| SSC_BF(RFMR_FSOS, SSC_FSOS_NONE)
| SSC_BF(RFMR_FSLEN, 0)
-   | SSC_BF(RFMR_DATNB, 0)
+   | SSC_BF(RFMR_DATNB, (channels - 1))
| SSC_BIT(RFMR_MSBF)
| SSC_BF(RFMR_LOOP, 0)
| SSC_BF(RFMR_DATLEN, (bits - 1));
 
tcmr =SSC_BF(TCMR_PERIOD, 0)
| SSC_BF(TCMR_STTDLY, START_DELAY)
-   | SSC_BF(TCMR_START, start_event)
+   | SSC_BF(TCMR_START, SSC_START_FALLING_RF)
| SSC_BF(TCMR_CKI, SSC_CKI_FALLING)
| SSC_BF(TCMR_CKO, SSC_CKO_NONE)
| SSC_BF(TCMR_CKS, ssc->clk_from_rk_pin ?
@@ -495,7 +485,7 @@ static int atmel_ssc_hw_params(struct snd_pcm_substream 
*substream,
| SSC_BF(TFMR_FSDEN, 0)
| SSC_BF(TFMR_FSOS, SSC_FSOS_NONE)
| SSC_BF(TFMR_FSLEN, 0)
-   | SSC_BF(TFMR_DATNB, 0)
+   | SSC_BF(TFMR_DATNB, (channels - 1))
| SSC_BIT(TFMR_MSBF)
| SSC_BF(TFMR_DATDEF, 0)
| SSC_BF(TFMR_DATLEN, (bits - 1));
-- 
2.3.0.rc0



Re: [PATCH v7 00/17] Introduce ACPI for ARM64 based on ACPI 5.1

2015-01-15 Thread Hanjun Guo

On 2015-01-16 02:23, Catalin Marinas wrote:

Hi Grant,

On Thu, Jan 15, 2015 at 04:26:20PM +, Grant Likely wrote:

On Wed, Jan 14, 2015 at 3:04 PM, Hanjun Guo  wrote:

This is the v7 of ACPI core patches for ARM64 based on ACPI 5.1


I'll get right to the point: Can we please have this series queued up
for v3.20?


Before you even ask for this, please look at the patches and realise
that there is a complete lack of Reviewed-by tags on the code (well,
apart from trivial Kconfig changes). In addition, the series touches on
other subsystems like clocksource, irqchip, acpi and I don't see any
acks from the corresponding maintainers. So even if I wanted to merge


For the ACPI part, Rafael already said that "Having looked at the
patches recently, I don't see any major problems in them from the ACPI
core perspective, so to me they are good to go." [1]
Does that count as an ack for this?

Thanks
Hanjun

[1]:
http://lkml.iu.edu/hypermail/linux/kernel/1409.1/03363.html


Re: [PATCH] virtio_rng: drop extra empty line

2015-01-15 Thread Herbert Xu
On Fri, Jan 16, 2015 at 09:16:00AM +0200, Michael S. Tsirkin wrote:
>
> So let's add this to maintainers?
> Will you ack something like below?

Sure I'll add that.

Thanks!
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: linux-next: build failure after merge of the i2c tree

2015-01-15 Thread Pantelis Antoniou
Hi Stephen,

> On Jan 16, 2015, at 04:22 , Stephen Rothwell  wrote:
> 
> Hi Wolfram,
> 
> After merging the i2c tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
> 
> drivers/i2c/i2c-core.c: In function 'i2c_unregister_device':
> drivers/i2c/i2c-core.c:1016:3: error: implicit declaration of function 
> 'of_node_clear_flag' [-Werror=implicit-function-declaration]
>   of_node_clear_flag(client->dev.of_node, OF_POPULATED);
>   ^
> drivers/i2c/i2c-core.c:1016:43: error: 'OF_POPULATED' undeclared (first use 
> in this function)
>   of_node_clear_flag(client->dev.of_node, OF_POPULATED);
>   ^
> 
> Caused by commit d5285c36e6d2 ("i2c: Mark instantiated device nodes
> with OF_POPULATE").
> 
> I have used the version of the i2c tree from next-20150115 for today.

A patch that fixes it has already been posted.

> -- 
> Cheers,
> Stephen Rothwell                    s...@canb.auug.org.au
> 

Regards

— Pantelis



Re: [PATCH] virtio_rng: drop extra empty line

2015-01-15 Thread Michael S. Tsirkin
On Fri, Jan 16, 2015 at 10:21:09AM +1100, Herbert Xu wrote:
> On Thu, Jan 15, 2015 at 01:50:42PM +0200, Michael S. Tsirkin wrote:
> > makes code look a bit prettier.
> > 
> > Signed-off-by: Michael S. Tsirkin 
> 
> Please resend this patch with a cc to linux-cry...@vger.kernel.org.
> 
> Thanks!

So let's add this to maintainers?
Will you ack something like below?

--->


MAINTAINERS: add linux-crypto to hw random

hw random is crypto-related, Cc the linux-crypto list
on patches.

Signed-off-by: Michael S. Tsirkin 


diff --git a/MAINTAINERS b/MAINTAINERS
index 3589d67..4d54f2e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4403,6 +4403,7 @@ F:include/linux/hwmon*.h
 HARDWARE RANDOM NUMBER GENERATOR CORE
 M: Matt Mackall 
 M: Herbert Xu 
+L: linux-cry...@vger.kernel.org
 S: Odd fixes
 F: Documentation/hw_random.txt
 F: drivers/char/hw_random/


Re: [PATCH] mm/vmscan: fix highidx argument type

2015-01-15 Thread Michael S. Tsirkin
On Thu, Jan 15, 2015 at 02:49:20PM -0800, Andrew Morton wrote:
> On Fri, 16 Jan 2015 00:18:12 +0200 "Michael S. Tsirkin"  
> wrote:
> 
> > for_each_zone_zonelist_nodemask wants an enum zone_type
> > argument, but is passed gfp_t:
> > 
> > mm/vmscan.c:2658:9:expected int enum zone_type [signed] highest_zoneidx
> > mm/vmscan.c:2658:9:got restricted gfp_t [usertype] gfp_mask
> > mm/vmscan.c:2658:9: warning: incorrect type in argument 2 (different base 
> > types)
> > mm/vmscan.c:2658:9:expected int enum zone_type [signed] highest_zoneidx
> > mm/vmscan.c:2658:9:got restricted gfp_t [usertype] gfp_mask
> 
> Which tool emitted these warnings?

Oh, sorry.
It's sparse.

> > convert argument to the correct type.
> > 
> > ...
> >
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2656,7 +2656,7 @@ static bool throttle_direct_reclaim(gfp_t gfp_mask, 
> > struct zonelist *zonelist,
> >  * should make reasonable progress.
> >  */
> > for_each_zone_zonelist_nodemask(zone, z, zonelist,
> > -   gfp_mask, nodemask) {
> > +   gfp_zone(gfp_mask), nodemask) {
> > if (zone_idx(zone) > ZONE_NORMAL)
> > continue;
> 
> hm, I wonder what the runtime effects are.
> 
> The throttle_direct_reclaim() comment isn't really accurate, is it? 
> "Throttle direct reclaimers if backing storage is backed by the
> network".  The code is applicable to all types of backing, but was
> added to address problems which are mainly observed with network
> backing?


As far as I can tell, yes. It would seem that it can cause
deadlocks in theory.  Cc stable on the grounds that it's obvious?

-- 
MST
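
The type mismatch discussed in this thread can be illustrated with a small user-space sketch. It is not kernel code: the `toy_` names, the flag values and the three-zone layout are assumptions made up for illustration. It only shows why a flag bitmask passed where a zone index is expected no longer limits which zones a walk visits, while the mapped index does.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for the kernel's zone indices. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

/* Hypothetical flag bits; the real gfp_t bit layout differs. */
typedef unsigned int toy_gfp_t;
#define TOY_GFP_DMA      0x01u
#define TOY_GFP_HIGHMEM  0x02u
#define TOY_GFP_WAIT     0x10u

/* Map the zone-modifier bits of a flag mask to the highest usable zone. */
static enum zone_type toy_gfp_zone(toy_gfp_t flags)
{
    if (flags & TOY_GFP_DMA)
        return ZONE_DMA;
    if (flags & TOY_GFP_HIGHMEM)
        return ZONE_HIGHMEM;
    return ZONE_NORMAL;
}

/* Count the zones a walk limited to highest_zoneidx would visit. */
static int toy_zones_visited(int highest_zoneidx)
{
    int n = 0;

    assert(highest_zoneidx >= 0);
    for (int z = 0; z < MAX_NR_ZONES; z++)
        if (z <= highest_zoneidx)
            n++;
    return n;
}
```

With these toy values, a DMA request mapped through `toy_gfp_zone()` visits one zone, whereas the raw mask `TOY_GFP_DMA | TOY_GFP_WAIT` (17 as an integer) exceeds every zone index and visits all three.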


Re: [vireshk:cpufreq/core/locking 57/60] drivers/cpufreq/cpufreq.c:512:5: sparse: symbol '__cpufreq_boost_supported' was not declared. Should it be static?

2015-01-15 Thread Viresh Kumar
On 16 January 2015 at 12:25, kbuild test robot  wrote:
> tree:   https://git.linaro.org/people/vireshk/linux cpufreq/core/locking
> head:   12d5339e685739289c2f629c943b8bfad4c64f1e
> commit: fbce49b78afdaeeaee3017cfbd968e44ddba8496 [57/60] cpufreq: Drop 
> forward declaration of __cpufreq_boost_trigger_state()
> reproduce:
>   # apt-get install sparse
>   git checkout fbce49b78afdaeeaee3017cfbd968e44ddba8496
>   make ARCH=x86_64 allmodconfig
>   make C=1 CF=-D__CHECK_ENDIAN__
>
>
> sparse warnings: (new ones prefixed by >>)
>
>>> drivers/cpufreq/cpufreq.c:512:5: sparse: symbol '__cpufreq_boost_supported' 
>>> was not declared. Should it be static?
>
> Please review and possibly fold the followup patch.

Fixed. Thanks.


[vireshk:cpufreq/core/locking 57/60] drivers/cpufreq/cpufreq.c:512:5: sparse: symbol '__cpufreq_boost_supported' was not declared. Should it be static?

2015-01-15 Thread kbuild test robot
tree:   https://git.linaro.org/people/vireshk/linux cpufreq/core/locking
head:   12d5339e685739289c2f629c943b8bfad4c64f1e
commit: fbce49b78afdaeeaee3017cfbd968e44ddba8496 [57/60] cpufreq: Drop forward 
declaration of __cpufreq_boost_trigger_state()
reproduce:
  # apt-get install sparse
  git checkout fbce49b78afdaeeaee3017cfbd968e44ddba8496
  make ARCH=x86_64 allmodconfig
  make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> drivers/cpufreq/cpufreq.c:512:5: sparse: symbol '__cpufreq_boost_supported' 
>> was not declared. Should it be static?

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
http://lists.01.org/mailman/listinfo/kbuild                 Intel Corporation


[PATCH vireshk] cpufreq: __cpufreq_boost_supported() can be static

2015-01-15 Thread kbuild test robot
drivers/cpufreq/cpufreq.c:512:5: sparse: symbol '__cpufreq_boost_supported' was 
not declared. Should it be static?

Signed-off-by: Fengguang Wu 
---
 cpufreq.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index c151f4b..4e3d42f 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -509,7 +509,7 @@ unlock:
return ret;
 }
 
-int __cpufreq_boost_supported(void)
+static int __cpufreq_boost_supported(void)
 {
if (likely(cpufreq_driver))
return cpufreq_driver->boost_supported;


RE: [RFC PATCH] fs: file freeze support

2015-01-15 Thread Namjae Jeon

> 
>   Hello,
Hi Jan,
> 

> > +
> > +int file_write_unfreeze(struct inode *inode)
> > +{
> > +   struct super_block *sb = inode->i_sb;
> > +
> > +   if (!S_ISREG(inode->i_mode))
> > +   return -EINVAL;
> > +
> > +   spin_lock(&inode->i_lock);
> > +
> > +   if (!(inode->i_state & I_WRITE_FREEZED)) {
> > +   spin_unlock(&inode->i_lock);
> > +   return -EINVAL;
> > +   }
> > +
> > +   inode->i_state &= ~I_WRITE_FREEZED;
> > +   smp_wmb();
> > +   wake_up(&sb->s_writers.wait_unfrozen);
> > +   spin_unlock(&inode->i_lock);
> > +   return 0;
> > +}
> > +EXPORT_SYMBOL(file_write_unfreeze);
>   So I was looking at the implementation and I have a few comments:
> 1) The trick with freezing superblock looks nice but I'm somewhat worried
> that if we wanted to heavily use per-inode freezing to defrag the whole
> filesystem it may be too slow to freeze the whole fs, mark one inode as
> frozen and then unfreeze the fs. But I guess we'll see that once have some
> reasonably working implementation.
Dmitry suggested a good way to avoid repeated fs freeze and fs unfreeze
calls:

ioctl(sb,FIFREEZE)
while (f = pop(files_list))
  ioctl(f,FS_IOC_FWFREEZE)
ioctl(sb,FITHAW)

In file_write_freeze, we could first check whether the fs is already frozen.
If it is, then we can set the inode write-freeze state directly after taking
the relevant lock to prevent fs_thaw while the inode state is being set.

> 
> 2) The tests you are currently doing are racy. If
> things happen as:
>   CPU1                          CPU2
> inode_start_write()
>                                 file_write_freeze()
> sb_start_pagefault()
> Do modifications.
> 
> Then you have a CPU modifying a file while file_write_freeze() has
> succeeded so it should be frozen.
> 
> If you swap inode_start_write() with sb_start_pagefault() the above race
> doesn't happen but userspace program has to be really careful not to hit a
> deadlock. E.g. if you tried to freeze two inodes the following could happen:
>   CPU1                          CPU2
>                                 file_write_freeze(inode1)
> fault on inode1:
> sb_start_pagefault()
> inode_start_write() -> blocks
>                                 file_write_freeze(inode2)
>                                 blocks in freeze_super()
> 
> So I don't think this is a good scheme for inode freezing...
To solve this race, we can fold inode_start_write into sb_start_write and use
an approach similar to __sb_start_write.
How about the scheme below?

void inode_start_write(struct inode *inode)
{
    struct super_block *sb = inode->i_sb;

retry:
    if (unlikely(inode->i_state & I_WRITE_FREEZED)) {
        DEFINE_WAIT(wait);

        prepare_to_wait(&sb->s_writers.wait_unfrozen, &wait,
                        TASK_UNINTERRUPTIBLE);
        schedule();
        finish_wait(&sb->s_writers.wait_unfrozen, &wait);

        goto retry;
    }

    sb_start_write(sb);

    /* check whether file_write_freeze() raced with us */
    if (unlikely(inode->i_state & I_WRITE_FREEZED)) {
        sb_end_write(sb);
        goto retry;
    }
}

Thanks for your review!
> 
>   Honza
> --
> Jan Kara 
> SUSE Labs, CR
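
The check, acquire, re-check shape of the inode_start_write() proposal in this message can be sketched in user space. This is only an illustration under stated assumptions: `struct toy_inode` and the `toy_` functions are hypothetical, C11 atomics stand in for the inode state flag and the superblock writer count, and the sketch returns false where the kernel code would sleep and retry. It says nothing about whether the proposed kernel scheme is free of races.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the fields the proposal touches. */
struct toy_inode {
    atomic_bool write_frozen;  /* plays the role of I_WRITE_FREEZED */
    atomic_int  sb_writers;    /* plays the role of the sb_start_write() count */
};

/* Non-blocking variant: returns false where the kernel code would wait. */
static bool toy_inode_try_start_write(struct toy_inode *inode)
{
    assert(inode);

    /* First check: do not even register as a writer if already frozen. */
    if (atomic_load(&inode->write_frozen))
        return false;

    /* Register as a writer, as sb_start_write() would. */
    atomic_fetch_add(&inode->sb_writers, 1);

    /* Re-check: a freeze may have slipped in between the two steps. */
    if (atomic_load(&inode->write_frozen)) {
        atomic_fetch_sub(&inode->sb_writers, 1);  /* like sb_end_write() */
        return false;
    }
    return true;
}

/* Drop the writer registration taken by a successful try_start_write. */
static void toy_inode_end_write(struct toy_inode *inode)
{
    atomic_fetch_sub(&inode->sb_writers, 1);
}
```

The second check is the point of the pattern: a writer that registered just before the freeze flag was set backs out again, so it never proceeds with the flag set.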



[PATCH 1/1] serial: Fintek F81232 driver improvement

2015-01-15 Thread Peter Hung
From: Peter Hong 

The original driver implemented only the TX path; RX and MSR/MCR/LSR handling
did not work. Rewrite the driver to make the device usable.

This patch was tested with PassMark BurnInTest using Cycle-to-115200 plus the
MCR/MSR check for 15 minutes, and was also checked across Suspend-to-RAM and
Suspend-to-Disk.

Signed-off-by: Peter Hong 
---
 drivers/usb/serial/f81232.c | 528 
 1 file changed, 440 insertions(+), 88 deletions(-)

diff --git a/drivers/usb/serial/f81232.c b/drivers/usb/serial/f81232.c
index c5dc233..5ae6bc9 100644
--- a/drivers/usb/serial/f81232.c
+++ b/drivers/usb/serial/f81232.c
@@ -23,9 +23,16 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+
+#define FINTEK_MAGIC 'F'
+#define FINTEK_GET_ID  _IOR(FINTEK_MAGIC, 3, int)
+#define FINTEK_VENDOR_ID   0x1934
+#define FINTEK_DEVICE_ID   0x0706  /* RS232 1 port */
 
 static const struct usb_device_id id_table[] = {
-   { USB_DEVICE(0x1934, 0x0706) },
+   { USB_DEVICE(FINTEK_VENDOR_ID, FINTEK_DEVICE_ID) },
{ } /* Terminating entry */
 };
 MODULE_DEVICE_TABLE(usb, id_table);
@@ -37,30 +44,257 @@ MODULE_DEVICE_TABLE(usb, id_table);
 #define UART_STATE_TRANSIENT_MASK  0x74
 #define UART_DCD   0x01
 #define UART_DSR   0x02
-#define UART_BREAK_ERROR   0x04
 #define UART_RING  0x08
-#define UART_FRAME_ERROR   0x10
-#define UART_PARITY_ERROR  0x20
-#define UART_OVERRUN_ERROR 0x40
 #define UART_CTS   0x80
 
+
+#define UART_BREAK_ERROR   0x10
+#define UART_FRAME_ERROR   0x08
+#define UART_PARITY_ERROR  0x04
+#define UART_OVERRUN_ERROR 0x02
+
+
+#define  SERIAL_EVEN_PARITY (UART_LCR_PARITY | UART_LCR_EPAR)
+
+
+#define REGISTER_REQUEST 0xA0
+#define F81232_USB_TIMEOUT 1000
+#define F81232_USB_RETRY 20
+
+
+#define SERIAL_BASE_ADDRESS   ((__u16)0x0120)
+#define RECEIVE_BUFFER_REGISTER((__u16)(0x00) + SERIAL_BASE_ADDRESS)
+#define TRANSMIT_HOLDING_REGISTER  ((__u16)(0x00) + SERIAL_BASE_ADDRESS)
+#define INTERRUPT_ENABLE_REGISTER  ((__u16)(0x01) + SERIAL_BASE_ADDRESS)
+#define INTERRUPT_IDENT_REGISTER   ((__u16)(0x02) + SERIAL_BASE_ADDRESS)
+#define FIFO_CONTROL_REGISTER  ((__u16)(0x02) + SERIAL_BASE_ADDRESS)
+#define LINE_CONTROL_REGISTER  ((__u16)(0x03) + SERIAL_BASE_ADDRESS)
+#define MODEM_CONTROL_REGISTER ((__u16)(0x04) + SERIAL_BASE_ADDRESS)
+#define LINE_STATUS_REGISTER   ((__u16)(0x05) + SERIAL_BASE_ADDRESS)
+#define MODEM_STATUS_REGISTER  ((__u16)(0x06) + SERIAL_BASE_ADDRESS)
+
+static int m_enable_debug;
+
+module_param(m_enable_debug, int, S_IRUGO);
+MODULE_PARM_DESC(m_enable_debug, "Debugging mode enabled or not");
+
+#define LOG_MESSAGE(x, y, ...) \
+   printk(x  y, ##__VA_ARGS__)
+
+#define LOG_DEBUG_MESSAGE(level, y, ...)   \
+   do { if (unlikely(m_enable_debug))  \
+   printk(level  y, ##__VA_ARGS__); } while (0)
+
+
 struct f81232_private {
spinlock_t lock;
-   u8 line_control;
-   u8 line_status;
+   u8 modem_control;
+   u8 modem_status;
+   struct usb_device *dev;
+
+   struct work_struct int_worker;
+   struct usb_serial_port *port;
 };
 
-static void f81232_update_line_status(struct usb_serial_port *port,
- unsigned char *data,
- unsigned int actual_length)
+
+static inline int calc_baud_divisor(u32 baudrate)
 {
-   /*
-* FIXME: Update port->icount, and call
-*
-*  wake_up_interruptible(&port->port.delta_msr_wait);
-*
-*on MSR changes.
-*/
+   u32 divisor, rem;
+
+   divisor = 115200L / baudrate;
+   rem = 115200L % baudrate;
+
+   /* Round to nearest divisor */
+   if (((rem * 2) >= baudrate) && (baudrate != 110))
+   divisor++;
+
+   return divisor;
+}
+
+
+static inline int f81232_get_register(struct usb_device *dev,
+   u16 reg, u8 *data)
+{
+   int status;
+   int i = 0;
+timeout_get_repeat:
+
+   status = usb_control_msg(dev,
+usb_rcvctrlpipe(dev, 0),
+REGISTER_REQUEST,
+0xc0,
+reg,
+0,
+data,
+sizeof(*data),
+F81232_USB_TIMEOUT);
+   if (status < 0) {
+   i++;
+
+   if (i < F81232_USB_RETRY) {
+   mdelay(1);
+   goto timeout_get_repeat;
+   }
+   }
+   return status;
+}
+
+
+static inline int f81232_set_register(struct usb_device *dev,
+   u16 reg, u8 data)
+{
+   int status;
+   int i = 0;
+
+timeout_set_repeat:
+   status = 0;
+
+   status = usb_control_msg(dev,
+   
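
The round-to-nearest rule in the posted calc_baud_divisor() can be tried out in user space. The sketch below restates that arithmetic with standard C types; the name `baud_divisor` and the `BASE_BAUD` macro are introduced here for illustration, and the zero check is an addition that the posted code does not have.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_BAUD 115200u  /* constant used by the posted calc_baud_divisor() */

/* Divisor for a requested baud rate, rounded to the nearest integer. */
static uint32_t baud_divisor(uint32_t baudrate)
{
    uint32_t divisor, rem;

    assert(baudrate != 0);  /* added guard: avoid division by zero */

    divisor = BASE_BAUD / baudrate;
    rem = BASE_BAUD % baudrate;

    /* Round up when the remainder is at least half the baud rate,
     * except for 110 baud, which the posted code always truncates. */
    if (rem * 2 >= baudrate && baudrate != 110)
        divisor++;

    return divisor;
}
```

For example, 9600 divides evenly and gives 12, while 76800 leaves a remainder of 38400, exactly half the rate, so the divisor is rounded up from 1 to 2.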

[PATCH v18 02/12] input: cyapa: add gen5 trackpad device basic functions support

2015-01-15 Thread Dudley Du
Building on the cyapa core, add basic function support for gen5 trackpad
devices so that they work with the kernel input system.
Using the state parse interface, the cyapa driver can also automatically
determine whether the attached trackpad uses the gen3 or the gen5 protocol
and then select the correct protocol for it.
TEST=test on Chromebooks.

Signed-off-by: Dudley Du 
---
 drivers/input/mouse/Makefile |2 +-
 drivers/input/mouse/cyapa.c  |   13 +
 drivers/input/mouse/cyapa.h  |1 +
 drivers/input/mouse/cyapa_gen5.c | 1678 ++
 4 files changed, 1693 insertions(+), 1 deletion(-)
 create mode 100644 drivers/input/mouse/cyapa_gen5.c

diff --git a/drivers/input/mouse/Makefile b/drivers/input/mouse/Makefile
index 8bd950d..8a9c98e 100644
--- a/drivers/input/mouse/Makefile
+++ b/drivers/input/mouse/Makefile
@@ -24,7 +24,7 @@ obj-$(CONFIG_MOUSE_SYNAPTICS_I2C) += synaptics_i2c.o
 obj-$(CONFIG_MOUSE_SYNAPTICS_USB)  += synaptics_usb.o
 obj-$(CONFIG_MOUSE_VSXXXAA)+= vsxxxaa.o
 
-cyapatp-objs := cyapa.o cyapa_gen3.o
+cyapatp-objs := cyapa.o cyapa_gen3.o cyapa_gen5.o
 psmouse-objs := psmouse-base.o synaptics.o focaltech.o
 
 psmouse-$(CONFIG_MOUSE_PS2_ALPS)   += alps.o
diff --git a/drivers/input/mouse/cyapa.c b/drivers/input/mouse/cyapa.c
index 36c6433..fa4d501 100644
--- a/drivers/input/mouse/cyapa.c
+++ b/drivers/input/mouse/cyapa.c
@@ -184,6 +184,14 @@ static int cyapa_get_state(struct cyapa *cyapa)
if (!error)
goto out_detected;
}
+   if ((cyapa->gen == CYAPA_GEN_UNKNOWN ||
+   cyapa->gen == CYAPA_GEN5) &&
+   !smbus && even_addr) {
+   error = cyapa_gen5_ops.state_parse(cyapa,
+   status, BL_STATUS_SIZE);
+   if (!error)
+   goto out_detected;
+   }
 
/*
 * Write 0x00 0x00 to trackpad device to force update its
@@ -272,6 +280,9 @@ static int cyapa_check_is_operational(struct cyapa *cyapa)
return error;
 
switch (cyapa->gen) {
+   case CYAPA_GEN5:
+   cyapa->ops = &cyapa_gen5_ops;
+   break;
case CYAPA_GEN3:
cyapa->ops = &cyapa_gen3_ops;
break;
@@ -506,6 +517,8 @@ static int cyapa_initialize(struct cyapa *cyapa)
 
/* ops.initialize() is aimed to prepare for module communications. */
error = cyapa_gen3_ops.initialize(cyapa);
+   if (!error)
+   error = cyapa_gen5_ops.initialize(cyapa);
if (error)
return error;
 
diff --git a/drivers/input/mouse/cyapa.h b/drivers/input/mouse/cyapa.h
index aab19b7..481a60d 100644
--- a/drivers/input/mouse/cyapa.h
+++ b/drivers/input/mouse/cyapa.h
@@ -292,5 +292,6 @@ u16 cyapa_pwr_cmd_to_sleep_time(u8 pwr_mode);
 
 extern const char product_id[];
 extern const struct cyapa_dev_ops cyapa_gen3_ops;
+extern const struct cyapa_dev_ops cyapa_gen5_ops;
 
 #endif
diff --git a/drivers/input/mouse/cyapa_gen5.c b/drivers/input/mouse/cyapa_gen5.c
new file mode 100644
index 000..a049ae3
--- /dev/null
+++ b/drivers/input/mouse/cyapa_gen5.c
@@ -0,0 +1,1678 @@
+/*
+ * Cypress APA trackpad with I2C interface
+ *
+ * Author: Dudley Du 
+ *
+ * Copyright (C) 2014 Cypress Semiconductor, Inc.
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file COPYING in the main directory of this archive for
+ * more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "cyapa.h"
+
+
+/* Macro of Gen5 */
+#define RECORD_EVENT_NONE0
+#define RECORD_EVENT_TOUCHDOWN  1
+#define RECORD_EVENT_DISPLACE2
+#define RECORD_EVENT_LIFTOFF 3
+
+#define CYAPA_TSG_FLASH_MAP_BLOCK_SIZE  0x80
+#define CYAPA_TSG_IMG_FW_HDR_SIZE   13
+#define CYAPA_TSG_FW_ROW_SIZE   (CYAPA_TSG_FLASH_MAP_BLOCK_SIZE)
+#define CYAPA_TSG_IMG_START_ROW_NUM 0x002e
+#define CYAPA_TSG_IMG_END_ROW_NUM   0x01fe
+#define CYAPA_TSG_IMG_APP_INTEGRITY_ROW_NUM 0x01ff
+#define CYAPA_TSG_IMG_MAX_RECORDS   (CYAPA_TSG_IMG_END_ROW_NUM - \
+   CYAPA_TSG_IMG_START_ROW_NUM + 1 + 1)
+#define CYAPA_TSG_IMG_READ_SIZE (CYAPA_TSG_FLASH_MAP_BLOCK_SIZE / 2)
+#define CYAPA_TSG_START_OF_APPLICATION  0x1700
+#define CYAPA_TSG_APP_INTEGRITY_SIZE60
+#define CYAPA_TSG_FLASH_MAP_METADATA_SIZE   60
+#define CYAPA_TSG_BL_KEY_SIZE   8
+
+#define CYAPA_TSG_MAX_CMD_SIZE  256
+
+#define GEN5_BL_CMD_VERIFY_APP_INTEGRITY0x31
+#define GEN5_BL_CMD_GET_BL_INFO0x38
+#define GEN5_BL_CMD_PROGRAM_VERIFY_ROW  0x39
+#define GEN5_BL_CMD_LAUNCH_APP 0x3b
+#define GEN5_BL_CMD_INITIATE_BL  
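
The derived sizes in the macro block of this patch can be checked with a little arithmetic. The sketch below copies the three base values from the posted macros; the shortened `TSG_` names and the `toy_` helpers are introduced here for illustration. Reading the extra `+ 1` in CYAPA_TSG_IMG_MAX_RECORDS as one record for the application integrity row (0x01ff) is an assumption, not something the posted text states.

```c
#include <assert.h>

/* Values copied from the posted cyapa_gen5.c macros. */
#define TSG_FLASH_MAP_BLOCK_SIZE 0x80
#define TSG_IMG_START_ROW_NUM    0x002e
#define TSG_IMG_END_ROW_NUM      0x01fe

static_assert(TSG_IMG_START_ROW_NUM < TSG_IMG_END_ROW_NUM,
              "image rows must form a non-empty range");

/* Same expression as CYAPA_TSG_IMG_MAX_RECORDS. */
static int toy_img_max_records(void)
{
    /* Rows 0x002e..0x01fe inclusive. */
    int image_rows = TSG_IMG_END_ROW_NUM - TSG_IMG_START_ROW_NUM + 1;

    /* One more record (assumed: the application integrity row). */
    return image_rows + 1;
}

/* Same expression as CYAPA_TSG_IMG_READ_SIZE. */
static int toy_img_read_size(void)
{
    return TSG_FLASH_MAP_BLOCK_SIZE / 2;
}
```

With the posted values this gives 465 image rows, 466 records in total, and a read size of 64 bytes.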

[PATCH v18 01/12] input: cyapa: re-design driver to support multi-trackpad in one driver

2015-01-15 Thread Dudley Du
To support trackpad devices with different chipsets and communication
protocols in a single cyapa driver, the driver is redesigned as one driver
core plus multiple device-specific components.
This patch contains the cyapa driver core. It supplies the basic functions
for working with the kernel and the input subsystem, and the interfaces
through which the device-specific components connect and work together as
one driver.
TEST=test on Chromebooks.

Signed-off-by: Dudley Du 
---
 drivers/input/mouse/Kconfig  |   10 -
 drivers/input/mouse/Makefile |3 +-
 drivers/input/mouse/cyapa.c  | 1076 +++---
 drivers/input/mouse/cyapa.h  |  296 +++
 drivers/input/mouse/cyapa_gen3.c |  807 
 5 files changed, 1516 insertions(+), 676 deletions(-)
 create mode 100644 drivers/input/mouse/cyapa.h
 create mode 100644 drivers/input/mouse/cyapa_gen3.c

diff --git a/drivers/input/mouse/Kconfig b/drivers/input/mouse/Kconfig
index 2541bfa..d8b46b0 100644
--- a/drivers/input/mouse/Kconfig
+++ b/drivers/input/mouse/Kconfig
@@ -146,16 +146,6 @@ config MOUSE_PS2_OLPC
 
  If unsure, say N.
 
-config MOUSE_PS2_FOCALTECH
-   bool "FocalTech PS/2 mouse protocol extension" if EXPERT
-   default y
-   depends on MOUSE_PS2
-   help
- Say Y here if you have a FocalTech PS/2 TouchPad connected to
- your system.
-
- If unsure, say Y.
-
 config MOUSE_SERIAL
tristate "Serial mouse"
select SERIO
diff --git a/drivers/input/mouse/Makefile b/drivers/input/mouse/Makefile
index 560003d..8bd950d 100644
--- a/drivers/input/mouse/Makefile
+++ b/drivers/input/mouse/Makefile
@@ -8,7 +8,7 @@ obj-$(CONFIG_MOUSE_AMIGA)   += amimouse.o
 obj-$(CONFIG_MOUSE_APPLETOUCH) += appletouch.o
 obj-$(CONFIG_MOUSE_ATARI)  += atarimouse.o
 obj-$(CONFIG_MOUSE_BCM5974)+= bcm5974.o
-obj-$(CONFIG_MOUSE_CYAPA)  += cyapa.o
+obj-$(CONFIG_MOUSE_CYAPA)  += cyapatp.o
 obj-$(CONFIG_MOUSE_ELAN_I2C)   += elan_i2c.o
 obj-$(CONFIG_MOUSE_GPIO)   += gpio_mouse.o
 obj-$(CONFIG_MOUSE_INPORT) += inport.o
@@ -24,6 +24,7 @@ obj-$(CONFIG_MOUSE_SYNAPTICS_I2C) += synaptics_i2c.o
 obj-$(CONFIG_MOUSE_SYNAPTICS_USB)  += synaptics_usb.o
 obj-$(CONFIG_MOUSE_VSXXXAA)+= vsxxxaa.o
 
+cyapatp-objs := cyapa.o cyapa_gen3.o
 psmouse-objs := psmouse-base.o synaptics.o focaltech.o
 
 psmouse-$(CONFIG_MOUSE_PS2_ALPS)   += alps.o
diff --git a/drivers/input/mouse/cyapa.c b/drivers/input/mouse/cyapa.c
index 1bece8c..36c6433 100644
--- a/drivers/input/mouse/cyapa.c
+++ b/drivers/input/mouse/cyapa.c
@@ -20,408 +20,127 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
+#include "cyapa.h"
 
-/* APA trackpad firmware generation */
-#define CYAPA_GEN3   0x03   /* support MT-protocol B with tracking ID. */
-
-#define CYAPA_NAME   "Cypress APA Trackpad (cyapa)"
-
-/* commands for read/write registers of Cypress trackpad */
-#define CYAPA_CMD_SOFT_RESET   0x00
-#define CYAPA_CMD_POWER_MODE   0x01
-#define CYAPA_CMD_DEV_STATUS   0x02
-#define CYAPA_CMD_GROUP_DATA   0x03
-#define CYAPA_CMD_GROUP_CMD0x04
-#define CYAPA_CMD_GROUP_QUERY  0x05
-#define CYAPA_CMD_BL_STATUS0x06
-#define CYAPA_CMD_BL_HEAD  0x07
-#define CYAPA_CMD_BL_CMD   0x08
-#define CYAPA_CMD_BL_DATA  0x09
-#define CYAPA_CMD_BL_ALL   0x0a
-#define CYAPA_CMD_BLK_PRODUCT_ID   0x0b
-#define CYAPA_CMD_BLK_HEAD 0x0c
-
-/* report data start reg offset address. */
-#define DATA_REG_START_OFFSET  0x
-
-#define BL_HEAD_OFFSET 0x00
-#define BL_DATA_OFFSET 0x10
-
-/*
- * Operational Device Status Register
- *
- * bit 7: Valid interrupt source
- * bit 6 - 4: Reserved
- * bit 3 - 2: Power status
- * bit 1 - 0: Device status
- */
-#define REG_OP_STATUS 0x00
-#define OP_STATUS_SRC 0x80
-#define OP_STATUS_POWER   0x0c
-#define OP_STATUS_DEV 0x03
-#define OP_STATUS_MASK (OP_STATUS_SRC | OP_STATUS_POWER | OP_STATUS_DEV)
-
-/*
- * Operational Finger Count/Button Flags Register
- *
- * bit 7 - 4: Number of touched finger
- * bit 3: Valid data
- * bit 2: Middle Physical Button
- * bit 1: Right Physical Button
- * bit 0: Left physical Button
- */
-#define REG_OP_DATA1   0x01
-#define OP_DATA_VALID  0x08
-#define OP_DATA_MIDDLE_BTN 0x04
-#define OP_DATA_RIGHT_BTN  0x02
-#define OP_DATA_LEFT_BTN   0x01
-#define OP_DATA_BTN_MASK (OP_DATA_MIDDLE_BTN | OP_DATA_RIGHT_BTN | \
- OP_DATA_LEFT_BTN)
-
-/*
- * Bootloader Status Register
- *
- * bit 7: Busy
- * bit 6 - 5: Reserved
- * bit 4: Bootloader running
- * bit 3 - 1: Reserved
- * bit 0: Checksum valid
- */
-#define REG_BL_STATUS0x01
-#define BL_STATUS_BUSY   0x80
-#define BL_STATUS_RUNNING0x10
-#define BL_STATUS_DATA_VALID 0x08
-#define BL_STATUS_CSUM_VALID 0x01
-
-/*
- 

[PATCH v18 06/12] input: cyapa: add gen3 trackpad device firmware update function support

2015-01-15 Thread Dudley Du
Add firmware image update support for gen3 trackpad devices.
It can be used through the sysfs update_fw interface.
TEST=test on Chromebooks.

Signed-off-by: Dudley Du 
---
 drivers/input/mouse/cyapa_gen3.c | 309 +++
 1 file changed, 309 insertions(+)

diff --git a/drivers/input/mouse/cyapa_gen3.c b/drivers/input/mouse/cyapa_gen3.c
index 1b62c7d..214d45a 100644
--- a/drivers/input/mouse/cyapa_gen3.c
+++ b/drivers/input/mouse/cyapa_gen3.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "cyapa.h"
 
 
@@ -115,6 +116,18 @@ struct cyapa_reg_data {
struct cyapa_touch touches[5];
 } __packed;
 
+struct gen3_write_block_cmd {
+   u8 checksum_seed;  /* Always be 0xff */
+   u8 cmd_code;   /* command code: 0x39 */
+   u8 key[8]; /* 8-byte security key */
+   __be16 block_num;
+   u8 block_data[CYAPA_FW_BLOCK_SIZE];
+   u8 block_checksum;  /* Calculated using bytes 12 - 75 */
+   u8 cmd_checksum;/* Calculated using bytes 0-76 */
+} __packed;
+
+static const u8 security_key[] = {
+   0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07 };
 static const u8 bl_activate[] = { 0x00, 0xff, 0x38, 0x00, 0x01, 0x02, 0x03,
0x04, 0x05, 0x06, 0x07 };
 static const u8 bl_deactivate[] = { 0x00, 0xff, 0x3b, 0x00, 0x01, 0x02, 0x03,
@@ -423,6 +436,87 @@ static int cyapa_gen3_state_parse(struct cyapa *cyapa, u8 *reg_data, int len)
return -EAGAIN;
 }
 
+/*
+ * Enter bootloader by soft resetting the device.
+ *
+ * If device is already in the bootloader, the function just returns.
+ * Otherwise, reset the device; after reset, device enters bootloader idle
+ * state immediately.
+ *
+ * Returns:
+ *   0on success
+ *   -EAGAIN  device was reset, but is not now in bootloader idle state
+ *   < 0  if the device never responds within the timeout
+ */
+static int cyapa_gen3_bl_enter(struct cyapa *cyapa)
+{
+   int error;
+   int waiting_time;
+
+   error = cyapa_poll_state(cyapa, 500);
+   if (error)
+   return error;
+   if (cyapa->state == CYAPA_STATE_BL_IDLE) {
+   /* Already in BL_IDLE. Skipping reset. */
+   return 0;
+   }
+
+   if (cyapa->state != CYAPA_STATE_OP)
+   return -EAGAIN;
+
+   cyapa->operational = false;
+   cyapa->state = CYAPA_STATE_NO_DEVICE;
+   error = cyapa_write_byte(cyapa, CYAPA_CMD_SOFT_RESET, 0x01);
+   if (error)
+   return -EIO;
+
+   usleep_range(25000, 5);
+   waiting_time = 2000;  /* For some chipsets, max waiting time is 1~2s. */
+   do {
+   error = cyapa_poll_state(cyapa, 500);
+   if (error) {
+   if (error == -ETIMEDOUT) {
+   waiting_time -= 500;
+   continue;
+   }
+   return error;
+   }
+
+   if ((cyapa->state == CYAPA_STATE_BL_IDLE) &&
+   !(cyapa->status[REG_BL_STATUS] & BL_STATUS_WATCHDOG))
+   break;
+
+   msleep(100);
+   waiting_time -= 100;
+   } while (waiting_time > 0);
+
+   if ((cyapa->state != CYAPA_STATE_BL_IDLE) ||
+   (cyapa->status[REG_BL_STATUS] & BL_STATUS_WATCHDOG))
+   return -EAGAIN;
+
+   return 0;
+}
+
+static int cyapa_gen3_bl_activate(struct cyapa *cyapa)
+{
+   int error;
+
+   error = cyapa_i2c_reg_write_block(cyapa, 0, sizeof(bl_activate),
+   bl_activate);
+   if (error)
+   return error;
+
+   /* Wait for bootloader to activate; takes between 2 and 12 seconds */
+   msleep(2000);
+   error = cyapa_poll_state(cyapa, 11000);
+   if (error)
+   return error;
+   if (cyapa->state != CYAPA_STATE_BL_ACTIVE)
+   return -EAGAIN;
+
+   return 0;
+}
+
 static int cyapa_gen3_bl_deactivate(struct cyapa *cyapa)
 {
int error;
@@ -483,6 +577,212 @@ static int cyapa_gen3_bl_exit(struct cyapa *cyapa)
return 0;
 }
 
+static u16 cyapa_gen3_csum(const u8 *buf, size_t count)
+{
+   int i;
+   u16 csum = 0;
+
+   for (i = 0; i < count; i++)
+   csum += buf[i];
+
+   return csum;
+}
+
+/*
+ * Verify the integrity of a CYAPA firmware image file.
+ *
+ * The firmware image file is 30848 bytes, composed of 482 64-byte blocks.
+ *
+ * The first 2 blocks are the firmware header.
+ * The next 480 blocks are the firmware image.
+ *
+ * The first two bytes of the header hold the header checksum, computed by
+ * summing the other 126 bytes of the header.
+ * The last two bytes of the header hold the firmware image checksum, computed
+ * by summing the 30720 bytes of the image modulo 0x.
+ *
+ * Both checksums are stored little-endian.
+ */
+static int cyapa_gen3_check_fw(struct cyapa *cyapa, const struct firmware 
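
The image layout described in the comment above cyapa_gen3_check_fw() can be sketched as a user-space check. This is not the driver's implementation, which is cut off in this archive: `toy_check_fw`, `sum16`, `get_le16` and the `FW_` macros are hypothetical names. The 16-bit wraparound of the sums is an assumption based on the posted cyapa_gen3_csum() returning a u16, since the modulus in the comment is garbled in the archive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FW_BLOCK_SIZE  64u
#define FW_HDR_SIZE    (2u * FW_BLOCK_SIZE)    /* first 2 blocks: header */
#define FW_IMAGE_SIZE  (480u * FW_BLOCK_SIZE)  /* next 480 blocks: image */
#define FW_TOTAL_SIZE  (FW_HDR_SIZE + FW_IMAGE_SIZE)

/* Byte-wise sum that wraps at 16 bits, like the posted cyapa_gen3_csum(). */
static uint16_t sum16(const uint8_t *buf, size_t count)
{
    uint16_t csum = 0;

    for (size_t i = 0; i < count; i++)
        csum = (uint16_t)(csum + buf[i]);
    return csum;
}

/* Read a 16-bit little-endian value. */
static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Returns 0 if size and both checksums match the described layout, else -1. */
static int toy_check_fw(const uint8_t *data, size_t size)
{
    assert(data != NULL);

    if (size != FW_TOTAL_SIZE)
        return -1;

    /* Header checksum: first two bytes, covering the other 126 header bytes. */
    if (get_le16(&data[0]) != sum16(&data[2], FW_HDR_SIZE - 2))
        return -1;

    /* Image checksum: last two header bytes, covering the whole image. */
    if (get_le16(&data[FW_HDR_SIZE - 2]) !=
        sum16(&data[FW_HDR_SIZE], FW_IMAGE_SIZE))
        return -1;

    return 0;
}
```

Because the header checksum covers the stored image checksum, a caller that builds a test image has to fill in the image checksum first and the header checksum second.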

[PATCH v18 12/12] input: cyapa: add acpi device id support

2015-01-15 Thread Dudley Du
Add ACPI device ID support.
acpi device id "CYAP" is for old gen3 trackpad devices.
acpi device id "CYAP0001" is for new gen5 trackpad devices.
TEST=test on Chromebooks.

Signed-off-by: Dudley Du 
---
 drivers/input/mouse/cyapa.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/input/mouse/cyapa.c b/drivers/input/mouse/cyapa.c
index 5f9b24a..9dc9f65 100644
--- a/drivers/input/mouse/cyapa.c
+++ b/drivers/input/mouse/cyapa.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "cyapa.h"
 
 
@@ -1356,11 +1357,23 @@ static const struct i2c_device_id cyapa_id_table[] = {
 };
 MODULE_DEVICE_TABLE(i2c, cyapa_id_table);
 
+#ifdef CONFIG_ACPI
+static const struct acpi_device_id cyapa_acpi_id[] = {
+   { "CYAP", 0 },  /* Gen3 trackpad with 0x67 I2C address. */
+   { "CYAP0001", 0 },  /* Gen5 trackpad with 0x24 I2C address. */
+   { }
+};
+MODULE_DEVICE_TABLE(acpi, cyapa_acpi_id);
+#endif
+
 static struct i2c_driver cyapa_driver = {
.driver = {
.name = "cyapa",
.owner = THIS_MODULE,
.pm = &cyapa_pm_ops,
+#ifdef CONFIG_ACPI
+   .acpi_match_table = ACPI_PTR(cyapa_acpi_id),
+#endif
},
 
.probe = cyapa_probe,
-- 
1.9.1



[PATCH v18 05/12] input: cyapa: add sysfs interfaces support in the cyapa driver

2015-01-15 Thread Dudley Du
Add support for basic device control and features in the cyapa driver
through sysfs interfaces. These interfaces are commonly used before and
after production for checking and managing trackpad device state and for
updating the firmware image.
They include the mode, firmware_version and product_id interfaces for
reading the firmware version and trackpad product ID, the update_fw
interface for starting a firmware image update, and the baseline and
calibrate interfaces for reading and checking the state of the trackpad's
sensors.
TEST=test on Chromebooks.

Signed-off-by: Dudley Du 
---
 drivers/input/mouse/cyapa.c | 310 
 1 file changed, 310 insertions(+)

diff --git a/drivers/input/mouse/cyapa.c b/drivers/input/mouse/cyapa.c
index 43fa0cb..5f9b24a 100644
--- a/drivers/input/mouse/cyapa.c
+++ b/drivers/input/mouse/cyapa.c
@@ -32,6 +32,8 @@
 #define CYAPA_ADAPTER_FUNC_SMBUS  2
 #define CYAPA_ADAPTER_FUNC_BOTH   3
 
+#define CYAPA_FW_NAME  "cyapa.bin"
+
 const char product_id[] = "CYTRA";
 
 static int cyapa_reinitialize(struct cyapa *cyapa);
@@ -476,6 +478,38 @@ static int cyapa_create_input_dev(struct cyapa *cyapa)
return 0;
 }
 
+static void cyapa_enable_irq_for_cmd(struct cyapa *cyapa)
+{
+   struct input_dev *input = cyapa->input;
+
+   if (!input || !input->users) {
+   /*
+* When input is NULL, TP must be in deep sleep mode.
+* In this mode, later non-power I2C command will always failed
+* if not bring it out of deep sleep mode firstly,
+* so must command TP to active mode here.
+*/
+   if (!input || cyapa->operational)
+   cyapa->ops->set_power_mode(cyapa,
+   PWR_MODE_FULL_ACTIVE, 0);
+   /* Gen3 always using polling mode for command. */
+   if (cyapa->gen >= CYAPA_GEN5)
+   enable_irq(cyapa->client->irq);
+   }
+}
+
+static void cyapa_disable_irq_for_cmd(struct cyapa *cyapa)
+{
+   struct input_dev *input = cyapa->input;
+
+   if (!input || !input->users) {
+   if (cyapa->gen >= CYAPA_GEN5)
+   disable_irq(cyapa->client->irq);
+   if (!input || cyapa->operational)
+   cyapa->ops->set_power_mode(cyapa, PWR_MODE_OFF, 0);
+   }
+}
+
 /*
  * cyapa_sleep_time_to_pwr_cmd and cyapa_pwr_cmd_to_sleep_time
  *
@@ -859,6 +893,269 @@ static int cyapa_start_runtime(struct cyapa *cyapa)
 static inline int cyapa_start_runtime(struct cyapa *cyapa) { return 0; }
 #endif /* CONFIG_PM_RUNTIME */
 
+static ssize_t cyapa_show_fm_ver(struct device *dev,
+struct device_attribute *attr, char *buf)
+{
+   int error;
+   struct cyapa *cyapa = dev_get_drvdata(dev);
+
+   error = mutex_lock_interruptible(&cyapa->state_sync_lock);
+   if (error)
+   return error;
+   error = scnprintf(buf, PAGE_SIZE, "%d.%d\n", cyapa->fw_maj_ver,
+cyapa->fw_min_ver);
+   mutex_unlock(&cyapa->state_sync_lock);
+   return error;
+}
+
+static ssize_t cyapa_show_product_id(struct device *dev,
+struct device_attribute *attr, char *buf)
+{
+   struct cyapa *cyapa = dev_get_drvdata(dev);
+   int size;
+   int error;
+
+   error = mutex_lock_interruptible(&cyapa->state_sync_lock);
+   if (error)
+   return error;
+   size = scnprintf(buf, PAGE_SIZE, "%s\n", cyapa->product_id);
+   mutex_unlock(&cyapa->state_sync_lock);
+   return size;
+}
+
+static int cyapa_firmware(struct cyapa *cyapa, const char *fw_name)
+{
+   struct device *dev = &cyapa->client->dev;
+   const struct firmware *fw;
+   int error;
+
+   error = request_firmware(&fw, fw_name, dev);
+   if (error) {
+   dev_err(dev, "Could not load firmware from %s: %d\n",
+   fw_name, error);
+   return error;
+   }
+
+   error = cyapa->ops->check_fw(cyapa, fw);
+   if (error) {
+   dev_err(dev, "Invalid CYAPA firmware image: %s\n",
+   fw_name);
+   goto done;
+   }
+
+   /*
+* Resume the potentially suspended device because doing FW
+* update on a device not in the FULL mode has a chance to
+* fail.
+*/
+   pm_runtime_get_sync(dev);
+
+   /* Require IRQ support for firmware update commands. */
+   cyapa_enable_irq_for_cmd(cyapa);
+
+   error = cyapa->ops->bl_enter(cyapa);
+   if (error) {
+   dev_err(dev, "bl_enter failed, %d\n", error);
+   goto err_detect;
+   }
+
+   error = cyapa->ops->bl_activate(cyapa);
+   if (error) {
+   dev_err(dev, "bl_activate failed, %d\n", error);
+   goto err_detect;
+   }
+
+   error = 

[PATCH v18 03/12] input: cyapa: add power management interfaces support for the device

2015-01-15 Thread Dudley Du
Add the suspend_scanrate_ms power management interface to the device's
power group, so that users or applications can control the power management
strategy of the trackpad device according to their requirements.
TEST=test on Chromebooks.

Signed-off-by: Dudley Du 
---
 drivers/input/mouse/cyapa.c | 127 
 1 file changed, 127 insertions(+)

diff --git a/drivers/input/mouse/cyapa.c b/drivers/input/mouse/cyapa.c
index fa4d501..eb0e515 100644
--- a/drivers/input/mouse/cyapa.c
+++ b/drivers/input/mouse/cyapa.c
@@ -604,6 +604,127 @@ out:
return IRQ_HANDLED;
 }
 
+/*
+ **
+ * sysfs interface
+ **
+*/
+#ifdef CONFIG_PM_SLEEP
+static ssize_t cyapa_show_suspend_scanrate(struct device *dev,
+  struct device_attribute *attr,
+  char *buf)
+{
+   struct cyapa *cyapa = dev_get_drvdata(dev);
+   u8 pwr_cmd = cyapa->suspend_power_mode;
+   u16 sleep_time;
+   int len;
+   int error;
+
+   error = mutex_lock_interruptible(&cyapa->state_sync_lock);
+   if (error)
+   return error;
+   pwr_cmd = cyapa->suspend_power_mode;
+   sleep_time = cyapa->suspend_sleep_time;
+   mutex_unlock(&cyapa->state_sync_lock);
+
+   if (pwr_cmd == PWR_MODE_BTN_ONLY) {
+   len = scnprintf(buf, PAGE_SIZE, "%s\n", BTN_ONLY_MODE_NAME);
+   } else if (pwr_cmd == PWR_MODE_OFF) {
+   len = scnprintf(buf, PAGE_SIZE, "%s\n", OFF_MODE_NAME);
+   } else {
+   if (cyapa->gen == CYAPA_GEN3)
+   sleep_time = cyapa_pwr_cmd_to_sleep_time(pwr_cmd);
+   len = scnprintf(buf, PAGE_SIZE, "%u\n", sleep_time);
+   }
+
+   return len;
+}
+
+static ssize_t cyapa_update_suspend_scanrate(struct device *dev,
+struct device_attribute *attr,
+const char *buf, size_t count)
+{
+   struct cyapa *cyapa = dev_get_drvdata(dev);
+   u16 sleep_time;
+   int error;
+
+   error = mutex_lock_interruptible(&cyapa->state_sync_lock);
+   if (error)
+   return error;
+
+   if (sysfs_streq(buf, BTN_ONLY_MODE_NAME)) {
+   cyapa->suspend_power_mode = PWR_MODE_BTN_ONLY;
+   } else if (sysfs_streq(buf, OFF_MODE_NAME)) {
+   cyapa->suspend_power_mode = PWR_MODE_OFF;
+   } else if (!kstrtou16(buf, 10, &sleep_time)) {
+   cyapa->suspend_sleep_time = max_t(u16, sleep_time, 1000);
+   cyapa->suspend_power_mode =
+   cyapa_sleep_time_to_pwr_cmd(cyapa->suspend_sleep_time);
+   } else {
+   count = 0;
+   }
+
+   mutex_unlock(&cyapa->state_sync_lock);
+
+   return count ? count : -EINVAL;
+}
+
+static DEVICE_ATTR(suspend_scanrate_ms, S_IRUGO|S_IWUSR,
+  cyapa_show_suspend_scanrate,
+  cyapa_update_suspend_scanrate);
+
+static struct attribute *cyapa_power_wakeup_entries[] = {
+   &dev_attr_suspend_scanrate_ms.attr,
+   NULL,
+};
+
+static const struct attribute_group cyapa_power_wakeup_group = {
+   .name = power_group_name,
+   .attrs = cyapa_power_wakeup_entries,
+};
+
+static void cyapa_remove_power_wakeup_group(void *data)
+{
+   struct cyapa *cyapa = data;
+
+   sysfs_unmerge_group(&cyapa->client->dev.kobj,
+   &cyapa_power_wakeup_group);
+}
+
+static int cyapa_prepare_wakeup_controls(struct cyapa *cyapa)
+{
+   struct i2c_client *client = cyapa->client;
+   struct device *dev = &client->dev;
+   int error;
+
+   if (device_can_wakeup(dev)) {
+   error = sysfs_merge_group(&client->dev.kobj,
+   &cyapa_power_wakeup_group);
+   if (error) {
+   dev_err(dev, "failed to add power wakeup group: %d\n",
+   error);
+   return error;
+   }
+
+   error = devm_add_action(dev,
+   cyapa_remove_power_wakeup_group, cyapa);
+   if (error) {
+   cyapa_remove_power_wakeup_group(cyapa);
+   dev_err(dev, "failed to add power cleanup action: %d\n",
+   error);
+   return error;
+   }
+   }
+
+   return 0;
+}
+#else
+static inline int cyapa_prepare_wakeup_controls(struct cyapa *cyapa)
+{
+   return 0;
+}
+#endif /* CONFIG_PM_SLEEP */
+
 static int cyapa_probe(struct i2c_client *client,
   const struct i2c_device_id *dev_id)
 {
@@ -643,6 +764,12 @@ static int cyapa_probe(struct i2c_client *client,
return error;
}
 
+   error = cyapa_prepare_wakeup_controls(cyapa);
+   if (error) {
+   dev_err(dev, "failed to prepare wakeup controls: %d\n", 
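
The store handler in the patch above (cyapa_update_suspend_scanrate) accepts either a mode keyword or a decimal scan rate. The following is a minimal userspace sketch of that parsing pattern, not the driver's code. The keyword strings "buttononly" and "off", the enum, and both helper names are assumptions made for illustration (the patch refers to BTN_ONLY_MODE_NAME and OFF_MODE_NAME without showing their values), and strtoul() stands in for the kernel's kstrtou16().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; the driver stores a power command byte instead. */
enum suspend_mode { MODE_SCANRATE, MODE_BTN_ONLY, MODE_OFF };

/*
 * Compare two strings, ignoring one trailing newline on either side.
 * This mirrors the idea behind the kernel's sysfs_streq().
 */
static int streq_nl(const char *a, const char *b)
{
	size_t la = strlen(a), lb = strlen(b);

	if (la && a[la - 1] == '\n')
		la--;
	if (lb && b[lb - 1] == '\n')
		lb--;
	return la == lb && memcmp(a, b, la) == 0;
}

/* Returns 0 on success, -EINVAL if buf is neither a keyword nor a u16. */
static int parse_suspend_scanrate(const char *buf, enum suspend_mode *mode,
				  uint16_t *ms)
{
	char *end;
	unsigned long v;

	assert(buf && mode && ms);

	if (streq_nl(buf, "buttononly")) {	/* assumed keyword */
		*mode = MODE_BTN_ONLY;
		return 0;
	}
	if (streq_nl(buf, "off")) {		/* assumed keyword */
		*mode = MODE_OFF;
		return 0;
	}

	/* Reject empty input, signs and leading whitespace up front. */
	if (buf[0] < '0' || buf[0] > '9')
		return -EINVAL;

	errno = 0;
	v = strtoul(buf, &end, 10);
	if (errno)
		return -EINVAL;
	if (*end == '\n')
		end++;
	if (*end != '\0' || v > UINT16_MAX)
		return -EINVAL;

	*mode = MODE_SCANRATE;
	*ms = (uint16_t)v;
	return 0;
}
```

Parsing into local results first and only then updating shared state keeps the critical section short. The patch instead parses while holding state_sync_lock, which is also valid.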

[PATCH v18 07/12] input: cyapa: add gen3 trackpad device read baseline function support

2015-01-15 Thread Dudley Du
Add read baseline support for gen3 trackpad devices.
It can be used through the sysfs baseline interface.
TEST=test on Chromebooks.

Signed-off-by: Dudley Du 
---
 drivers/input/mouse/cyapa_gen3.c | 72 
 1 file changed, 72 insertions(+)

diff --git a/drivers/input/mouse/cyapa_gen3.c b/drivers/input/mouse/cyapa_gen3.c
index 214d45a..6e04e4b 100644
--- a/drivers/input/mouse/cyapa_gen3.c
+++ b/drivers/input/mouse/cyapa_gen3.c
@@ -783,6 +783,76 @@ static int cyapa_gen3_do_fw_update(struct cyapa *cyapa,
return 0;
 }
 
+static ssize_t cyapa_gen3_show_baseline(struct device *dev,
+  struct device_attribute *attr, char *buf)
+{
+   struct cyapa *cyapa = dev_get_drvdata(dev);
+   int max_baseline, min_baseline;
+   int tries;
+   int ret;
+
+   ret = cyapa_read_byte(cyapa, CYAPA_CMD_DEV_STATUS);
+   if (ret < 0) {
+   dev_err(dev, "Error reading dev status. err = %d\n", ret);
+   goto out;
+   }
+   if ((ret & CYAPA_DEV_NORMAL) != CYAPA_DEV_NORMAL) {
+   dev_warn(dev, "Trackpad device is busy. device state = 0x%x\n",
+ret);
+   ret = -EAGAIN;
+   goto out;
+   }
+
+   ret = cyapa_write_byte(cyapa, CYAPA_CMD_SOFT_RESET,
+  OP_REPORT_BASELINE_MASK);
+   if (ret < 0) {
+   dev_err(dev, "Failed to send report baseline command. %d\n",
+   ret);
+   goto out;
+   }
+
+   tries = 3;  /* Try for 30 to 60 ms */
+   do {
+   usleep_range(10000, 20000);
+
+   ret = cyapa_read_byte(cyapa, CYAPA_CMD_DEV_STATUS);
+   if (ret < 0) {
+   dev_err(dev, "Error reading dev status. err = %d\n",
+   ret);
+   goto out;
+   }
+   if ((ret & CYAPA_DEV_NORMAL) == CYAPA_DEV_NORMAL)
+   break;
+   } while (--tries);
+
+   if (tries == 0) {
+   dev_err(dev, "Device timed out going to Normal state.\n");
+   ret = -ETIMEDOUT;
+   goto out;
+   }
+
+   ret = cyapa_read_byte(cyapa, CYAPA_CMD_MAX_BASELINE);
+   if (ret < 0) {
+   dev_err(dev, "Failed to read max baseline. err = %d\n", ret);
+   goto out;
+   }
+   max_baseline = ret;
+
+   ret = cyapa_read_byte(cyapa, CYAPA_CMD_MIN_BASELINE);
+   if (ret < 0) {
+   dev_err(dev, "Failed to read min baseline. err = %d\n", ret);
+   goto out;
+   }
+   min_baseline = ret;
+
+   dev_dbg(dev, "Baseline report successful. Max: %d Min: %d\n",
+   max_baseline, min_baseline);
+   ret = scnprintf(buf, PAGE_SIZE, "%d %d\n", max_baseline, min_baseline);
+
+out:
+   return ret;
+}
+
 /*
  * cyapa_get_wait_time_for_pwr_cmd
  *
@@ -1104,6 +1174,8 @@ const struct cyapa_dev_ops cyapa_gen3_ops = {
.bl_deactivate = cyapa_gen3_bl_deactivate,
.bl_initiate = cyapa_gen3_bl_initiate,
 
+   .show_baseline = cyapa_gen3_show_baseline,
+
.initialize = cyapa_gen3_initialize,
 
.state_parse = cyapa_gen3_state_parse,
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
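
The baseline read in the patch above polls the device status a bounded number of times before giving up with -ETIMEDOUT. A minimal sketch of that bounded-polling pattern follows. The callback type, the fake device, and the 0x03 ready mask are assumptions for illustration, and the sleep between polls (usleep_range() in the driver) is left out so the example stays portable.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical status reader: returns a status byte (>= 0) or a negative errno. */
typedef int (*read_status_fn)(void *ctx);

/*
 * Poll until all bits in ready_mask are set, at most `tries` times.
 * Returns 0 when ready, a negative errno from the reader on I/O failure,
 * or -ETIMEDOUT when the device never reached the requested state.
 */
static int wait_for_state(read_status_fn read_status, void *ctx,
			  int ready_mask, int tries)
{
	assert(read_status && tries > 0);

	do {
		/* The driver sleeps before each read; omitted in this sketch. */
		int status = read_status(ctx);

		if (status < 0)
			return status;
		if ((status & ready_mask) == ready_mask)
			return 0;
	} while (--tries);

	return -ETIMEDOUT;
}

/* Fake device for demonstration: reports ready after a number of polls. */
struct fake_dev {
	int polls_until_ready;
	int polls;
};

static int fake_read_status(void *ctx)
{
	struct fake_dev *d = ctx;

	d->polls++;
	return d->polls >= d->polls_until_ready ? 0x03 : 0x00;
}
```

Returning from inside the loop on success avoids the separate "tries == 0" check after the loop, at the cost of a second exit point.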


[PATCH v18 08/12] input: cyapa: add gen3 trackpad device force re-calibrate function support

2015-01-15 Thread Dudley Du
Add force re-calibrate support for gen3 trackpad devices.
It can be used through the sysfs calibrate interface.
TEST=test on Chromebooks.

Signed-off-by: Dudley Du 
---
 drivers/input/mouse/cyapa_gen3.c | 59 
 1 file changed, 59 insertions(+)

diff --git a/drivers/input/mouse/cyapa_gen3.c b/drivers/input/mouse/cyapa_gen3.c
index 6e04e4b..77e9d70 100644
--- a/drivers/input/mouse/cyapa_gen3.c
+++ b/drivers/input/mouse/cyapa_gen3.c
@@ -783,6 +783,64 @@ static int cyapa_gen3_do_fw_update(struct cyapa *cyapa,
return 0;
 }
 
+static ssize_t cyapa_gen3_do_calibrate(struct device *dev,
+struct device_attribute *attr,
+const char *buf, size_t count)
+{
+   struct cyapa *cyapa = dev_get_drvdata(dev);
+   int tries;
+   int ret;
+
+   ret = cyapa_read_byte(cyapa, CYAPA_CMD_DEV_STATUS);
+   if (ret < 0) {
+   dev_err(dev, "Error reading dev status: %d\n", ret);
+   goto out;
+   }
+   if ((ret & CYAPA_DEV_NORMAL) != CYAPA_DEV_NORMAL) {
+   dev_warn(dev, "Trackpad device is busy, device state: 0x%02x\n",
+ret);
+   ret = -EAGAIN;
+   goto out;
+   }
+
+   ret = cyapa_write_byte(cyapa, CYAPA_CMD_SOFT_RESET,
+  OP_RECALIBRATION_MASK);
+   if (ret < 0) {
+   dev_err(dev, "Failed to send calibrate command: %d\n",
+   ret);
+   goto out;
+   }
+
+   tries = 20;  /* max recalibration timeout 2s. */
+   do {
+   /*
+* For this recalibration, the max time will not exceed 2s.
+* The average time is approximately 500 - 700 ms, and we
+* will check the status every 100 - 200ms.
+*/
+   usleep_range(100000, 200000);
+
+   ret = cyapa_read_byte(cyapa, CYAPA_CMD_DEV_STATUS);
+   if (ret < 0) {
+   dev_err(dev, "Error reading dev status: %d\n",
+   ret);
+   goto out;
+   }
+   if ((ret & CYAPA_DEV_NORMAL) == CYAPA_DEV_NORMAL)
+   break;
+   } while (--tries);
+
+   if (tries == 0) {
+   dev_err(dev, "Failed to calibrate. Timeout.\n");
+   ret = -ETIMEDOUT;
+   goto out;
+   }
+   dev_dbg(dev, "Calibration successful.\n");
+
+out:
+   return ret < 0 ? ret : count;
+}
+
 static ssize_t cyapa_gen3_show_baseline(struct device *dev,
   struct device_attribute *attr, char *buf)
 {
@@ -1175,6 +1233,7 @@ const struct cyapa_dev_ops cyapa_gen3_ops = {
.bl_initiate = cyapa_gen3_bl_initiate,
 
.show_baseline = cyapa_gen3_show_baseline,
+   .calibrate_store = cyapa_gen3_do_calibrate,
 
.initialize = cyapa_gen3_initialize,
 
-- 
1.9.1



[PATCH v18 10/12] input: cyapa: add gen5 trackpad device read baseline function support

2015-01-15 Thread Dudley Du
Add read baseline support for gen5 trackpad devices.
It can be used through the sysfs baseline interface.
TEST=test on Chromebooks.

Signed-off-by: Dudley Du 
---
 drivers/input/mouse/cyapa.h  |   2 +
 drivers/input/mouse/cyapa_gen5.c | 640 +++
 2 files changed, 642 insertions(+)

diff --git a/drivers/input/mouse/cyapa.h b/drivers/input/mouse/cyapa.h
index f5a2c22..fb3b863 100644
--- a/drivers/input/mouse/cyapa.h
+++ b/drivers/input/mouse/cyapa.h
@@ -265,6 +265,8 @@ struct cyapa {
u8 y_origin;  /* Y Axis Origin: 0 = top; 1 = bottom. */
int electrodes_x;  /* Number of electrodes on the X Axis*/
int electrodes_y;  /* Number of electrodes on the Y Axis*/
+   int electrodes_rx;  /* Number of Rx electrodes */
+   int aligned_electrodes_rx;  /* 4 aligned */
int max_z;
 
/*
diff --git a/drivers/input/mouse/cyapa_gen5.c b/drivers/input/mouse/cyapa_gen5.c
index 442b29d..7c6e339 100644
--- a/drivers/input/mouse/cyapa_gen5.c
+++ b/drivers/input/mouse/cyapa_gen5.c
@@ -363,6 +363,12 @@ struct gen5_app_get_parameter_data {
u8 parameter_id;
 } __packed;
 
+struct gen5_retrieve_panel_scan_data {
+   __le16 read_offset;
+   __le16 read_elements;
+   u8 data_id;
+} __packed;
+
 /* Variables to record latest gen5 trackpad power states. */
 #define GEN5_DEV_SET_PWR_STATE(cyapa, s)   ((cyapa)->dev_pwr_mode = (s))
 #define GEN5_DEV_GET_PWR_STATE(cyapa)  ((cyapa)->dev_pwr_mode)
@@ -1661,6 +1667,638 @@ static int cyapa_gen5_set_power_mode(struct cyapa *cyapa,
return 0;
 }
 
+static int cyapa_gen5_resume_scanning(struct cyapa *cyapa)
+{
+   u8 cmd[] = { 0x04, 0x00, 0x05, 0x00, 0x2f, 0x00, 0x04 };
+   u8 resp_data[6];
+   int resp_len;
+   int error;
+
+   /* Try to dump all buffered data before doing command. */
+   cyapa_empty_pip_output_data(cyapa, NULL, NULL, NULL);
+
+   resp_len = sizeof(resp_data);
+   error = cyapa_i2c_pip_cmd_irq_sync(cyapa,
+   cmd, sizeof(cmd),
+   resp_data, &resp_len,
+   500, cyapa_gen5_sort_tsg_pip_app_resp_data, true);
+   if (error || !VALID_CMD_RESP_HEADER(resp_data, 0x04))
+   return -EINVAL;
+
+   /* Try to dump all buffered data when resuming scanning. */
+   cyapa_empty_pip_output_data(cyapa, NULL, NULL, NULL);
+
+   return 0;
+}
+
+static int cyapa_gen5_suspend_scanning(struct cyapa *cyapa)
+{
+   u8 cmd[] = { 0x04, 0x00, 0x05, 0x00, 0x2f, 0x00, 0x03 };
+   u8 resp_data[6];
+   int resp_len;
+   int error;
+
+   /* Try to dump all buffered data before doing command. */
+   cyapa_empty_pip_output_data(cyapa, NULL, NULL, NULL);
+
+   resp_len = sizeof(resp_data);
+   error = cyapa_i2c_pip_cmd_irq_sync(cyapa,
+   cmd, sizeof(cmd),
+   resp_data, &resp_len,
+   500, cyapa_gen5_sort_tsg_pip_app_resp_data, true);
+   if (error || !VALID_CMD_RESP_HEADER(resp_data, 0x03))
+   return -EINVAL;
+
+   /* Try to dump all buffered data when suspending scanning. */
+   cyapa_empty_pip_output_data(cyapa, NULL, NULL, NULL);
+
+   return 0;
+}
+
+static s32 twos_complement_to_s32(s32 value, int num_bits)
+{
+   if (value >> (num_bits - 1))
+   value |=  -1 << num_bits;
+   return value;
+}
+
+static s32 cyapa_parse_structure_data(u8 data_format, u8 *buf, int buf_len)
+{
+   int data_size;
+   bool big_endian;
+   bool unsigned_type;
+   s32 value;
+
+   data_size = (data_format & 0x07);
+   big_endian = ((data_format & 0x10) == 0x00);
+   unsigned_type = ((data_format & 0x20) == 0x00);
+
+   if (buf_len < data_size)
+   return 0;
+
+   switch (data_size) {
+   case 1:
+   value  = buf[0];
+   break;
+   case 2:
+   if (big_endian)
+   value = get_unaligned_be16(buf);
+   else
+   value = get_unaligned_le16(buf);
+   break;
+   case 4:
+   if (big_endian)
+   value = get_unaligned_be32(buf);
+   else
+   value = get_unaligned_le32(buf);
+   break;
+   default:
+   /* Should not happen, just as default case here. */
+   value = 0;
+   break;
+   }
+
+   if (!unsigned_type)
+   value = twos_complement_to_s32(value, data_size * 8);
+
+   return value;
+}
+
+static void cyapa_gen5_guess_electrodes(struct cyapa *cyapa,
+   int *electrodes_rx, int *electrodes_tx)
+{
+   if (cyapa->electrodes_rx != 0) {
+   *electrodes_rx = cyapa->electrodes_rx;
+   *electrodes_tx = (cyapa->electrodes_x == *electrodes_rx) ?
+   cyapa->electrodes_y : cyapa->electrodes_x;
+   } else {
+
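
The patch above converts raw panel-scan fields of 1, 2 or 4 bytes into signed values with twos_complement_to_s32(). The sketch below shows the same sign-extension idea in portable userspace C. It is not the patch's implementation: sign_extend32() and read_u16() are names invented here, and the XOR-and-subtract form is swapped in because it avoids left-shifting a negative value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sign-extend the low num_bits bits of value to 32 bits.
 * Bits above num_bits are ignored. The final conversion to int32_t
 * assumes a two's complement target, as on all common platforms.
 */
static int32_t sign_extend32(uint32_t value, int num_bits)
{
	uint32_t sign_bit;

	assert(num_bits >= 1 && num_bits <= 32);
	if (num_bits == 32)
		return (int32_t)value;

	sign_bit = UINT32_C(1) << (num_bits - 1);
	value &= (sign_bit << 1) - 1;	/* keep only the low num_bits bits */
	return (int32_t)((value ^ sign_bit) - sign_bit);
}

/* Read a 16-bit field byte by byte, so alignment and host order do not matter. */
static uint16_t read_u16(const uint8_t *buf, int big_endian)
{
	return big_endian ? (uint16_t)((buf[0] << 8) | buf[1])
			  : (uint16_t)((buf[1] << 8) | buf[0]);
}
```

For example, the little-endian bytes FE FF read as 0xFFFE, which sign-extends from 16 bits to -2.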

[PATCH v18 09/12] input: cyapa: add gen5 trackpad device firmware update function support

2015-01-15 Thread Dudley Du
Add firmware image update support for gen5 trackpad devices.
It can be used through the sysfs update_fw interface.
TEST=test on Chromebooks.

Signed-off-by: Dudley Du 
---
 drivers/input/mouse/Kconfig  |   1 +
 drivers/input/mouse/cyapa_gen5.c | 391 +++
 2 files changed, 392 insertions(+)

diff --git a/drivers/input/mouse/Kconfig b/drivers/input/mouse/Kconfig
index d8b46b0..728490e 100644
--- a/drivers/input/mouse/Kconfig
+++ b/drivers/input/mouse/Kconfig
@@ -206,6 +206,7 @@ config MOUSE_BCM5974
 config MOUSE_CYAPA
tristate "Cypress APA I2C Trackpad support"
depends on I2C
+   select CRC_ITU_T
help
  This driver adds support for Cypress All Points Addressable (APA)
  I2C Trackpads, including the ones used in 2012 Samsung Chromebooks.
diff --git a/drivers/input/mouse/cyapa_gen5.c b/drivers/input/mouse/cyapa_gen5.c
index a049ae3..442b29d 100644
--- a/drivers/input/mouse/cyapa_gen5.c
+++ b/drivers/input/mouse/cyapa_gen5.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "cyapa.h"
 
 
@@ -264,6 +265,79 @@ struct cyapa_gen5_report_data {
struct cyapa_gen5_touch_record touch_records[10];
 } __packed;
 
+struct cyapa_tsg_bin_image_head {
+   u8 head_size;  /* Unit: bytes, including itself. */
+   u8 ttda_driver_major_version;  /* Reserved as 0. */
+   u8 ttda_driver_minor_version;  /* Reserved as 0. */
+   u8 fw_major_version;
+   u8 fw_minor_version;
+   u8 fw_revision_control_number[8];
+} __packed;
+
+struct cyapa_tsg_bin_image_data_record {
+   u8 flash_array_id;
+   __be16 row_number;
+   /* The number of bytes of flash data contained in this record. */
+   __be16 record_len;
+   /* The flash program data. */
+   u8 record_data[CYAPA_TSG_FW_ROW_SIZE];
+} __packed;
+
+struct cyapa_tsg_bin_image {
+   struct cyapa_tsg_bin_image_head image_head;
+   struct cyapa_tsg_bin_image_data_record records[0];
+} __packed;
+
+struct gen5_bl_packet_start {
+   u8 sop;  /* Start of packet, must be 01h */
+   u8 cmd_code;
+   __le16 data_length;  /* Size of data parameter start from data[0] */
+} __packed;
+
+struct gen5_bl_packet_end {
+   __le16 crc;
+   u8 eop;  /* End of packet, must be 17h */
+} __packed;
+
+struct gen5_bl_cmd_head {
+   __le16 addr;   /* Output report register address, must be 0004h */
+   /* Size of packet not including output report register address */
+   __le16 length;
+   u8 report_id;  /* Bootloader output report id, must be 40h */
+   u8 rsvd;  /* Reserved, must be 0 */
+   struct gen5_bl_packet_start packet_start;
+   u8 data[0];  /* Command data variable based on commands */
+} __packed;
+
+/* Initiate bootload command data structure. */
+struct gen5_bl_initiate_cmd_data {
+   /* Key must be "A5h 01h 02h 03h FFh FEh FDh 5Ah" */
+   u8 key[CYAPA_TSG_BL_KEY_SIZE];
+   u8 metadata_raw_parameter[CYAPA_TSG_FLASH_MAP_METADATA_SIZE];
+   __le16 metadata_crc;
+} __packed;
+
+struct gen5_bl_metadata_row_params {
+   __le16 size;
+   __le16 maximun_size;
+   __le32 app_start;
+   __le16 app_len;
+   __le16 app_crc;
+   __le32 app_entry;
+   __le32 upgrade_start;
+   __le16 upgrade_len;
+   __le16 entry_row_crc;
+   u8 padding[36];  /* Padding data must be 0 */
+   __le16 metadata_crc;  /* CRC starts at offset of 60 */
+} __packed;
+
+/* Bootload program and verify row command data structure */
+struct gen5_bl_flash_row_head {
+   u8 flash_array_id;
+   __le16 flash_row_id;
+   u8 flash_data[0];
+} __packed;
+
 struct gen5_app_cmd_head {
__le16 addr;   /* Output report register address, must be 0004h */
/* Size of packet not including output report register address */
@@ -297,6 +371,10 @@ struct gen5_app_get_parameter_data {
 #define GEN5_DEV_UNINIT_SLEEP_TIME(cyapa)  \
(((cyapa)->dev_sleep_time) == UNINIT_SLEEP_TIME)
 
+
+static u8 cyapa_gen5_bl_cmd_key[] = { 0xa5, 0x01, 0x02, 0x03,
+   0xff, 0xfe, 0xfd, 0x5a };
+
 static int cyapa_gen5_initialize(struct cyapa *cyapa)
 {
struct cyapa_gen5_cmd_states *gen5_pip = &cyapa->cmd_states.gen5;
@@ -618,6 +696,22 @@ static bool cyapa_gen5_sort_tsg_pip_app_resp_data(struct cyapa *cyapa,
return false;
 }
 
+static bool cyapa_gen5_sort_application_launch_data(struct cyapa *cyapa,
+   u8 *buf, int len)
+{
+   if (buf == NULL || len < GEN5_RESP_LENGTH_SIZE)
+   return false;
+
+   /*
+* After reset or power on, trackpad device always sets to 0x00 0x00
+* to indicate a reset or power on event.
+*/
+   if (buf[0] == 0 && buf[1] == 0)
+   return true;
+
+   return false;
+}
+
 static bool cyapa_gen5_sort_hid_descriptor_data(struct cyapa *cyapa,
u8 *buf, int len)
 {
@@ -923,6 +1017,80 @@ static int cyapa_gen5_state_parse(struct cyapa 

[PATCH v18 04/12] input: cyapa: add runtime power management interfaces support for the device

2015-01-15 Thread Dudley Du
Add the runtime_suspend_scanrate_ms power management interface to the
device's power group, so that users or applications can control the runtime
power management strategy of the trackpad device according to their
requirements.
TEST=test on Chromebooks.

Signed-off-by: Dudley Du 
---
 drivers/input/mouse/cyapa.c | 184 +++-
 drivers/input/mouse/cyapa.h |   4 +
 2 files changed, 187 insertions(+), 1 deletion(-)

diff --git a/drivers/input/mouse/cyapa.c b/drivers/input/mouse/cyapa.c
index eb0e515..43fa0cb 100644
--- a/drivers/input/mouse/cyapa.c
+++ b/drivers/input/mouse/cyapa.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "cyapa.h"
 
 
@@ -357,6 +358,10 @@ static int cyapa_open(struct input_dev *input)
}
 
enable_irq(client->irq);
+   if (!pm_runtime_enabled(&client->dev)) {
+   pm_runtime_set_active(&client->dev);
+   pm_runtime_enable(&client->dev);
+   }
 out:
mutex_unlock(&cyapa->state_sync_lock);
return error;
@@ -370,6 +375,10 @@ static void cyapa_close(struct input_dev *input)
mutex_lock(&cyapa->state_sync_lock);
 
disable_irq(client->irq);
+   if (pm_runtime_enabled(&client->dev))
+   pm_runtime_disable(&client->dev);
+   pm_runtime_set_suspended(&client->dev);
+
if (cyapa->operational)
cyapa->ops->set_power_mode(cyapa, PWR_MODE_OFF, 0);
 
@@ -539,6 +548,9 @@ static int cyapa_reinitialize(struct cyapa *cyapa)
struct input_dev *input = cyapa->input;
int error;
 
+   if (pm_runtime_enabled(dev))
+   pm_runtime_disable(dev);
+
/* Avoid command failures when TP was in OFF state. */
if (cyapa->operational)
cyapa->ops->set_power_mode(cyapa, PWR_MODE_FULL_ACTIVE, 0);
@@ -561,6 +573,13 @@ out:
/* Reset to power OFF state to save power when no user open. */
if (cyapa->operational)
cyapa->ops->set_power_mode(cyapa, PWR_MODE_OFF, 0);
+   } else if (!error && cyapa->operational) {
+   /*
+* Make sure only enable runtime PM when device is
+* in operational mode and input->users > 0.
+*/
+   pm_runtime_set_active(dev);
+   pm_runtime_enable(dev);
}
 
return error;
@@ -571,6 +590,7 @@ static irqreturn_t cyapa_irq(int irq, void *dev_id)
struct cyapa *cyapa = dev_id;
struct device *dev = &cyapa->client->dev;
 
+   pm_runtime_get_sync(dev);
if (device_may_wakeup(dev))
pm_wakeup_event(dev, 0);
 
@@ -601,6 +621,8 @@ static irqreturn_t cyapa_irq(int irq, void *dev_id)
}
 
 out:
+   pm_runtime_mark_last_busy(dev);
+   pm_runtime_put_sync_autosuspend(dev);
return IRQ_HANDLED;
 }
 
@@ -725,6 +747,118 @@ static inline int cyapa_prepare_wakeup_controls(struct cyapa *cyapa)
 }
 #endif /* CONFIG_PM_SLEEP */
 
+#ifdef CONFIG_PM_RUNTIME
+static ssize_t cyapa_show_rt_suspend_scanrate(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+   struct cyapa *cyapa = dev_get_drvdata(dev);
+   u8 pwr_cmd;
+   u16 sleep_time;
+   int error;
+
+   error = mutex_lock_interruptible(&cyapa->state_sync_lock);
+   if (error)
+   return error;
+   pwr_cmd = cyapa->runtime_suspend_power_mode;
+   sleep_time = cyapa->runtime_suspend_sleep_time;
+   mutex_unlock(&cyapa->state_sync_lock);
+
+   if (cyapa->gen == CYAPA_GEN3)
+   return scnprintf(buf, PAGE_SIZE, "%u\n",
+   cyapa_pwr_cmd_to_sleep_time(pwr_cmd));
+   return scnprintf(buf, PAGE_SIZE, "%u\n", sleep_time);
+}
+
+static ssize_t cyapa_update_rt_suspend_scanrate(struct device *dev,
+   struct device_attribute *attr,
+   const char *buf, size_t count)
+{
+   struct cyapa *cyapa = dev_get_drvdata(dev);
+   u16 time;
+   int error;
+
+   if (buf == NULL || count == 0 || kstrtou16(buf, 10, &time)) {
+   dev_err(dev, "invalid runtime suspend scanrate ms parameter\n");
+   return -EINVAL;
+   }
+
+   /*
+* When the suspend scanrate is changed, pm_runtime_get to resume
+* a potentially suspended device, update to the new pwr_cmd
+* and then pm_runtime_put to suspend into the new power mode.
+*/
+   pm_runtime_get_sync(dev);
+   error = mutex_lock_interruptible(&cyapa->state_sync_lock);
+   if (error)
+   return error;
+   cyapa->runtime_suspend_sleep_time = max_t(u16, time, 1000);
+   cyapa->runtime_suspend_power_mode =
+   cyapa_sleep_time_to_pwr_cmd(cyapa->runtime_suspend_sleep_time);
+   mutex_unlock(&cyapa->state_sync_lock);
+   pm_runtime_put_sync_autosuspend(dev);
+
+   return count;
+}
+
+static 

[PATCH v18 11/12] input: cyapa: add gen5 trackpad device force re-calibrate function support

2015-01-15 Thread Dudley Du
Add force re-calibrate support for gen5 trackpad devices.
It can be used through the sysfs calibrate interface.
TEST=test on Chromebooks.

Signed-off-by: Dudley Du 
---
 drivers/input/mouse/cyapa_gen5.c | 65 
 1 file changed, 65 insertions(+)

diff --git a/drivers/input/mouse/cyapa_gen5.c b/drivers/input/mouse/cyapa_gen5.c
index 7c6e339..ced2a2c 100644
--- a/drivers/input/mouse/cyapa_gen5.c
+++ b/drivers/input/mouse/cyapa_gen5.c
@@ -1715,6 +1715,70 @@ static int cyapa_gen5_suspend_scanning(struct cyapa *cyapa)
return 0;
 }
 
+static int cyapa_gen5_calibrate_pwcs(struct cyapa *cyapa,
+   u8 calibrate_sensing_mode_type)
+{
+   struct gen5_app_cmd_head *app_cmd_head;
+   u8 cmd[8];
+   u8 resp_data[6];
+   int resp_len;
+   int error;
+
+   /* Try to dump all buffered data before doing command. */
+   cyapa_empty_pip_output_data(cyapa, NULL, NULL, NULL);
+
+   memset(cmd, 0, sizeof(cmd));
+   app_cmd_head = (struct gen5_app_cmd_head *)cmd;
+   put_unaligned_le16(GEN5_OUTPUT_REPORT_ADDR, &app_cmd_head->addr);
+   put_unaligned_le16(sizeof(cmd) - 2, &app_cmd_head->length);
+   app_cmd_head->report_id = GEN5_APP_CMD_REPORT_ID;
+   app_cmd_head->cmd_code = GEN5_CMD_CALIBRATE;
+   app_cmd_head->parameter_data[0] = calibrate_sensing_mode_type;
+   resp_len = sizeof(resp_data);
+   error = cyapa_i2c_pip_cmd_irq_sync(cyapa,
+   cmd, sizeof(cmd),
+   resp_data, &resp_len,
+   5000, cyapa_gen5_sort_tsg_pip_app_resp_data, true);
+   if (error || !VALID_CMD_RESP_HEADER(resp_data, GEN5_CMD_CALIBRATE) ||
+   !GEN5_CMD_COMPLETE_SUCCESS(resp_data[5]))
+   return error < 0 ? error : -EAGAIN;
+
+   return 0;
+}
+
+static ssize_t cyapa_gen5_do_calibrate(struct device *dev,
+struct device_attribute *attr,
+const char *buf, size_t count)
+{
+   struct cyapa *cyapa = dev_get_drvdata(dev);
+   int error, calibrate_error;
+
+   /* 1. Suspend Scanning*/
+   error = cyapa_gen5_suspend_scanning(cyapa);
+   if (error)
+   return error;
+
+   /* 2. Do mutual capacitance fine calibrate. */
+   calibrate_error = cyapa_gen5_calibrate_pwcs(cyapa,
+   CYAPA_SENSING_MODE_MUTUAL_CAP_FINE);
+   if (calibrate_error)
+   goto resume_scanning;
+
+   /* 3. Do self capacitance calibrate. */
+   calibrate_error = cyapa_gen5_calibrate_pwcs(cyapa,
+   CYAPA_SENSING_MODE_SELF_CAP);
+   if (calibrate_error)
+   goto resume_scanning;
+
+resume_scanning:
+   /* 4. Resume Scanning*/
+   error = cyapa_gen5_resume_scanning(cyapa);
+   if (error || calibrate_error)
+   return error ? error : calibrate_error;
+
+   return count;
+}
+
 static s32 twos_complement_to_s32(s32 value, int num_bits)
 {
if (value >> (num_bits - 1))
@@ -2696,6 +2760,7 @@ const struct cyapa_dev_ops cyapa_gen5_ops = {
.bl_deactivate = cyapa_gen5_bl_deactivate,
 
.show_baseline = cyapa_gen5_show_baseline,
+   .calibrate_store = cyapa_gen5_do_calibrate,
 
.initialize = cyapa_gen5_initialize,
 
-- 
1.9.1
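
The calibrate handler in the patch above suspends scanning, runs the calibration steps, and then always resumes scanning, with a resume error taking precedence over a calibration error. A minimal sketch of that control flow, detached from the driver, might look like the following. The struct, the callback signatures, the demo stubs, and the integer mode values are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical step callbacks; each returns 0 or a negative error code. */
struct calib_steps {
	int (*suspend)(void *ctx);
	int (*calibrate)(void *ctx, int mode);
	int (*resume)(void *ctx);
};

static int run_calibration(const struct calib_steps *s, void *ctx,
			   const int *modes, size_t n_modes)
{
	int error, calib_error = 0;
	size_t i;

	assert(s && s->suspend && s->calibrate && s->resume);

	error = s->suspend(ctx);
	if (error)
		return error;	/* nothing was suspended, so nothing to undo */

	for (i = 0; i < n_modes; i++) {
		calib_error = s->calibrate(ctx, modes[i]);
		if (calib_error)
			break;	/* skip remaining modes, but still resume */
	}

	/* Always resume; a resume failure wins over a calibration failure. */
	error = s->resume(ctx);
	return error ? error : calib_error;
}

/* Demonstration stubs that record the call order as letters. */
struct trace {
	char log[16];
	size_t len;
	int fail_mode;	/* calibration mode that should fail, or -1 */
};

static void trace_add(struct trace *t, char c)
{
	if (t->len + 1 < sizeof(t->log)) {
		t->log[t->len++] = c;
		t->log[t->len] = '\0';
	}
}

static int demo_suspend(void *ctx) { trace_add(ctx, 'S'); return 0; }
static int demo_resume(void *ctx) { trace_add(ctx, 'R'); return 0; }

static int demo_calibrate(void *ctx, int mode)
{
	struct trace *t = ctx;

	trace_add(t, 'C');
	return mode == t->fail_mode ? -5 : 0;
}

static const struct calib_steps demo_steps = {
	.suspend = demo_suspend,
	.calibrate = demo_calibrate,
	.resume = demo_resume,
};
```

With two modes and no failure the stubs record "SCCR"; if the first mode fails they record "SCR", which shows that scanning is resumed either way.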



[PATCH v18 00/12] input: cyapa: instruction of cyapa patches

2015-01-15 Thread Dudley Du
The v18 patches have the updates below; for details of earlier updates, see
the history list:
1) Fix the 'cyapa_runtime_suspend' undeclared compile error against the latest
   linux-next source code. Verified on next-20150114 with make C=1.
2) Improve handling of the enter-bootloader failure on some chipsets that take
   a longer time.


This patch series redesigns the cyapa driver to support both the old
gen3 trackpad devices and the new gen5 trackpad devices in one
driver, which makes production support easier based on
customers' requirements. It also adds the sysfs functions and
interfaces required by users and customers.

The earlier gen3 and the latest gen5 trackpad devices use two different
chipsets with different protocols and interfaces. If the two types of
trackpad device were supported in two different drivers, it would be
difficult to manage production and later firmware updates.
For example, customers would not know which trackpad device firmware
image to use and update when both devices are integrated into the same
product. We therefore support both trackpad devices in the same
driver.
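
Supporting two protocol generations in one driver typically comes down to binding a per-generation operations table at detection time. The following is a minimal sketch of that idea only, not the series' actual cyapa_dev_ops layout: every type, field, and function name in it is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical generation ids; the driver uses CYAPA_GEN3 and CYAPA_GEN5. */
enum tp_gen { TP_GEN_UNKNOWN = 0, TP_GEN3 = 3, TP_GEN5 = 5 };

struct trackpad;

/* One table of callbacks per protocol generation. */
struct tp_ops {
	int (*calibrate)(struct trackpad *tp);
};

struct trackpad {
	enum tp_gen gen;
	const struct tp_ops *ops;
	int last_protocol;	/* set by the demo callbacks below */
};

static int gen3_calibrate(struct trackpad *tp)
{
	tp->last_protocol = 3;	/* stands in for the gen3 command sequence */
	return 0;
}

static int gen5_calibrate(struct trackpad *tp)
{
	tp->last_protocol = 5;	/* stands in for the gen5 command sequence */
	return 0;
}

static const struct tp_ops gen3_ops = { .calibrate = gen3_calibrate };
static const struct tp_ops gen5_ops = { .calibrate = gen5_calibrate };

/* Bind the ops table that matches the detected generation. */
static int tp_bind(struct trackpad *tp, enum tp_gen gen)
{
	assert(tp);

	switch (gen) {
	case TP_GEN3:
		tp->ops = &gen3_ops;
		break;
	case TP_GEN5:
		tp->ops = &gen5_ops;
		break;
	default:
		tp->ops = NULL;
		return -ENODEV;
	}
	tp->gen = gen;
	return 0;
}
```

Callers then invoke tp->ops->calibrate(tp) without branching on the generation, so one sysfs interface can serve both device types.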


Dudley Du (12):
  input: cyapa: re-design driver to support multi-trackpad in one driver
  input: cyapa: add gen5 trackpad device basic functions support
  input: cyapa: add power management interfaces support for the device
  input: cyapa: add runtime power management interfaces support for the
device
  input: cyapa: add sysfs interfaces support in the cyapa driver
  input: cyapa: add gen3 trackpad device firmware update function
support
  input: cyapa: add gen3 trackpad device read baseline function support
  input: cyapa: add gen3 trackpad device force re-calibrate function
support
  input: cyapa: add gen5 trackpad device firmware update function
support
  input: cyapa: add gen5 trackpad device read baseline function support
  input: cyapa: add gen5 trackpad device force re-calibrate function
support
  input: cyapa: add acpi device id support

 drivers/input/mouse/Kconfig  |   11 +-
 drivers/input/mouse/Makefile |3 +-
 drivers/input/mouse/cyapa.c  | 1711 ++-
 drivers/input/mouse/cyapa.h  |  303 +
 drivers/input/mouse/cyapa_gen3.c | 1247 +
 drivers/input/mouse/cyapa_gen5.c | 2774 ++
 6 files changed, 5378 insertions(+), 671 deletions(-)
 create mode 100644 drivers/input/mouse/cyapa.h
 create mode 100644 drivers/input/mouse/cyapa_gen3.c
 create mode 100644 drivers/input/mouse/cyapa_gen5.c


Patch series modification history:
V17 patches have below main updates compared with v16 patches:
1) Fix kernel oops when system booting up with finger on TP.
2) Remove unnecessary error log that may to system.
3) Split out the PM sleep code into cyapa_prepare_wakeup_controls(), and
   remove the CONFIG_PM_SLEEP and CONFIG_PM_RUNTIME #ifdefs from function bodies.
4) Supply stubs to cyapa_gen3_ops and cyapa_gen5_ops data structure to avoid
   checking for presence of various methods in ops.
5) Changing CYAPA_BOOTLOADER() and CYAPA_OPERATIONAL() macros to static inline
   functions as cyapa_is_bootloader_mode() and cyapa_is_operational_mode().
6) Remove touching runtime suspend state during system calling cyapa_suspend().
7) Do not enable runtime PM until the device is known to be out of bootloader.
8) Change to return -EAGAIN when bootloader is busy.
9) Correct spelling and code style issues.

V16 patches have below main updates compared with v15 patches:
1) Fix all misspelling and spacing issues.
2) Rename variables and functions with much clearer names.
3) Initialize and document tries near where it will be used.
4) Change the cmd buffer to a struct to make it more descriptive.

V15 patches have below main updates compared with v14 patches:
1) Fix all warning errors of sparse tool when running with "make C=1".
2) Change variable name "unique_str" to "product_id" for clearer meanings.
3) Update cyapa_i2c_write function to return error directly when length > 31.

V14 patches have below main updates compared with v13 patches:
1) Correct 9 misspellings of "bufferred" to "buffered".
2) Fix the upgrade issue of the MOUSE_CYAPA config being removed on make oldconfig
   by replacing "depends on I2C && CRC_ITU_T" with
"depends on I2C"
"select CRC_ITU_T"
   in patch 9.

V13 patches have below main updates compared with v12 patches:
1) Remove all debugfs interface, including read_fw and raw_data interfaces.
2) These patches are based on linux next-20141208.

V12 patches have below main updates compared with v11 patches:
1) Add check that when TP is detected but not operational, do not exit driver
   immediately, but wait and export the update_fw interface for recovering.
2) Re-arrange the function code and remove unnecessary prototype definitions from
   the header file.

V11 patches have below main updates compared with v10 patches:
1) Add ACPI device ID support for old gen3 and new gen5 trackpad devices.
2) Fix the unable 

linux-next: manual merge of the usb-gadget tree with the usb.current tree

2015-01-15 Thread Stephen Rothwell
Hi Felipe,

Today's linux-next merge of the usb-gadget tree got a conflict in
drivers/usb/dwc2/gadget.c between commit 62f4f0651ce8 ("usb: dwc2:
gadget: kill requests with 'force' in s3c_hsotg_udc_stop()") from the
usb.current tree and commit c6f5c050e2a7 ("usb: dwc2: gadget: add
bi-directional endpoint support") and 1141ea01d5fa ("usb: dwc2: gadget:
kill requests after disabling ep") from the usb-gadget tree.

I fixed it up (I think - see below) and can carry the fix as necessary
(no action is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc drivers/usb/dwc2/gadget.c
index 79242008085b,882a1a8953f5..
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@@ -2927,8 -2960,12 +2964,12 @@@ static int s3c_hsotg_udc_stop(struct us
  	mutex_lock(&hsotg->init_mutex);
  
  	/* all endpoints should be shutdown */
- 	for (ep = 1; ep < hsotg->num_of_eps; ep++)
- 		s3c_hsotg_ep_disable_force(&hsotg->eps[ep].ep, true);
+ 	for (ep = 1; ep < hsotg->num_of_eps; ep++) {
+ 		if (hsotg->eps_in[ep])
 -			s3c_hsotg_ep_disable(&hsotg->eps_in[ep]->ep);
++			s3c_hsotg_ep_disable_force(&hsotg->eps_in[ep]->ep, true);
+ 		if (hsotg->eps_out[ep])
 -			s3c_hsotg_ep_disable(&hsotg->eps_out[ep]->ep);
++			s3c_hsotg_ep_disable_force(&hsotg->eps_out[ep]->ep, true);
+ 	}
  
  	spin_lock_irqsave(&hsotg->lock, flags);
  




Re: [PATCH 2/2] staging: lustre: libcfs: fix space between function and open parenthesis

2015-01-15 Thread Jeremiah Mahler
Jia He,

On Fri, Jan 16, 2015 at 09:45:47AM +0800, Jia He wrote:
> This fixes the space warning checked by check_patch.pl
  ^^^
  checkpatch.pl
> 
[...]
> -static void kportal_memhog_free (struct libcfs_device_userstate *ldu)
> +static void kportal_memhog_free(struct libcfs_device_userstate *ldu)
>  {
[...]

I would have included this change in with the previous patch since it is
that patch that has the checkpatch warning.

-- 
- Jeremiah Mahler
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v7 1/4] Documentation: dt: add common bindings for hwspinlock

2015-01-15 Thread Ohad Ben-Cohen
On Thu, Jan 15, 2015 at 4:42 PM, Rob Herring  wrote:
> On Thu, Jan 15, 2015 at 7:55 AM, Mark Rutland  wrote:
>> On Thu, Jan 15, 2015 at 01:52:01PM +, Mark Rutland wrote:
>>> On Wed, Jan 14, 2015 at 08:58:18PM +, Suman Anna wrote:
>>> > +- hwlock-base-id:  An unique base Id for the locks for a particular 
>>> > hwlock
>>> > +   device. This property is mandatory for all platform
>>> > +   implementations.
>>>
>>> This property makes no sense. The ID encoded in the hwlock cells is
>>> relative to the instance (identified by phandle), not global. So the DT
>>> has no global ID space.
>>>
>>> Why do you think you need this?
>>
>> Having looked at the way this property is used, NAK.
>>
>> If you need to carve up a Linux-internal ID space, do that dynamically.
>> There is no need for this property.
>
> Better yet, don't create a Linux ID space for this. Everywhere we have
> one, we want to get rid of it.

Rob, Mark,

The hwlock is a basic hardware primitive that allows synchronization
between different processors in the system, which may be running Linux
as well as other operating systems, and may have no other means of
communication.

The hwlock id numbers are predefined, global and static across the
entire system: Linux may boot well after other operating systems are
already running and using these hwlocks to communicate, and therefore,
in order to use these hardware devices, it must not enumerate them
differently than the rest of the system.

Given that these id numbers are global, system-wide, static and
predefined, where Linux may just be one user of them, please
reconsider the approach as implemented by Suman, or suggest an
alternative one you may prefer.

Thanks,
Ohad.


[PATCH 5/5] KVM: nVMX: Enable nested posted interrupt processing.

2015-01-15 Thread Wincy Van
If a vcpu has an interrupt pending in vmx non-root mode, we
kick that vcpu to inject the interrupt in a timely manner. With
posted interrupt processing, the kick is not needed, and
interrupts are fully taken care of by hardware.

In nested vmx, this feature avoids many more vmexits
than in non-nested vmx.

This patch uses L0's POSTED_INTR_NV to avoid an unexpected
interrupt if L1's vector is different from L0's. If the vcpu
is in hardware's non-root mode, we use a physical IPI to
deliver posted interrupts; otherwise we deliver that
interrupt to L1 and kick that vcpu out of nested
non-root mode.
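
The delivery decision described in this commit message can be sketched in
plain C as follows. This is a minimal model under stated assumptions, not the
code from the patch: every demo_ and DEMO_ name is hypothetical, and the
"normal path" result for a non-matching vector is an assumption made for
illustration only.

```c
#include <stdbool.h>

/* Hypothetical names; this models only the decision in the commit message. */
enum demo_action {
	DEMO_NORMAL_PATH,       /* assumed: not a nested posted interrupt */
	DEMO_SEND_PHYSICAL_IPI, /* vcpu is in non-root mode: hardware finishes delivery */
	DEMO_DELIVER_TO_L1,     /* otherwise: deliver to L1 and kick the vcpu */
};

static enum demo_action demo_nested_posted_intr(int vector,
						int l1_notification_vector,
						bool l1_uses_posted_intr,
						bool vcpu_in_non_root_mode)
{
	/* Only L1's notification vector is treated as a posted interrupt. */
	if (!l1_uses_posted_intr || vector != l1_notification_vector)
		return DEMO_NORMAL_PATH;

	/* The physical IPI would be sent with L0's own notification vector. */
	if (vcpu_in_non_root_mode)
		return DEMO_SEND_PHYSICAL_IPI;

	return DEMO_DELIVER_TO_L1;
}
```

The sketch keeps the three outcomes separate so that the fallback case (vcpu
not in non-root mode) is visible as its own branch.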

Signed-off-by: Wincy Van 
---
 arch/x86/kvm/vmx.c |  131 ++--
 1 files changed, 127 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ea56e9f..5aeef79 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -215,6 +215,7 @@ struct __packed vmcs12 {
u64 tsc_offset;
u64 virtual_apic_page_addr;
u64 apic_access_addr;
+   u64 posted_intr_desc_addr;
u64 ept_pointer;
u64 eoi_exit_bitmap0;
u64 eoi_exit_bitmap1;
@@ -334,6 +335,7 @@ struct __packed vmcs12 {
u32 vmx_preemption_timer_value;
u32 padding32[7]; /* room for future expansion */
u16 virtual_processor_id;
+   u16 posted_intr_nv;
u16 guest_es_selector;
u16 guest_cs_selector;
u16 guest_ss_selector;
@@ -406,6 +408,8 @@ struct nested_vmx {
 */
struct page *apic_access_page;
struct page *virtual_apic_page;
+   struct page *pi_desc_page;
+   struct pi_desc *pi_desc;
u64 msr_ia32_feature_control;

struct hrtimer preemption_timer;
@@ -621,6 +625,7 @@ static int max_shadow_read_write_fields =

 static const unsigned short vmcs_field_to_offset_table[] = {
FIELD(VIRTUAL_PROCESSOR_ID, virtual_processor_id),
+   FIELD(POSTED_INTR_NV, posted_intr_nv),
FIELD(GUEST_ES_SELECTOR, guest_es_selector),
FIELD(GUEST_CS_SELECTOR, guest_cs_selector),
FIELD(GUEST_SS_SELECTOR, guest_ss_selector),
@@ -646,6 +651,7 @@ static const unsigned short vmcs_field_to_offset_table[] = {
FIELD64(TSC_OFFSET, tsc_offset),
FIELD64(VIRTUAL_APIC_PAGE_ADDR, virtual_apic_page_addr),
FIELD64(APIC_ACCESS_ADDR, apic_access_addr),
+   FIELD64(POSTED_INTR_DESC_ADDR, posted_intr_desc_addr),
FIELD64(EPT_POINTER, ept_pointer),
FIELD64(EOI_EXIT_BITMAP0, eoi_exit_bitmap0),
FIELD64(EOI_EXIT_BITMAP1, eoi_exit_bitmap1),
@@ -798,6 +804,7 @@ static void kvm_cpu_vmxon(u64 addr);
 static void kvm_cpu_vmxoff(void);
 static bool vmx_mpx_supported(void);
 static bool vmx_xsaves_supported(void);
+static int vmx_vm_has_apicv(struct kvm *kvm);
 static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr);
 static void vmx_set_segment(struct kvm_vcpu *vcpu,
struct kvm_segment *var, int seg);
@@ -1159,6 +1166,11 @@ static inline bool nested_cpu_has_vid(struct
vmcs12 *vmcs12)
return nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
 }

+static inline bool nested_cpu_has_posted_intr(struct vmcs12 *vmcs12)
+{
+   return vmcs12->pin_based_vm_exec_control & PIN_BASED_POSTED_INTR;
+}
+
 static inline bool is_exception(u32 intr_info)
 {
return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK))
@@ -2362,6 +2374,9 @@ static void nested_vmx_setup_ctls_msrs(struct
vcpu_vmx *vmx)
vmx->nested.nested_vmx_pinbased_ctls_high |=
PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR |
PIN_BASED_VMX_PREEMPTION_TIMER;
+   if (vmx_vm_has_apicv(vmx->vcpu.kvm))
+   vmx->nested.nested_vmx_pinbased_ctls_high |=
+   PIN_BASED_POSTED_INTR;

/* exit controls */
rdmsr(MSR_IA32_VMX_EXIT_CTLS,
@@ -4267,6 +4282,46 @@ static int vmx_vm_has_apicv(struct kvm *kvm)
return enable_apicv && irqchip_in_kernel(kvm);
 }

+static int vmx_deliver_nested_posted_interrupt(struct kvm_vcpu *vcpu,
+   int vector)
+{
+   int r = 0;
+   unsigned long flags;
+   struct vmcs12 *vmcs12;
+
+   /*
+* if vcpu is in L2, we are fast enough to complete
+* before L1 changes/destroys vmcs12.
+*/
+   local_irq_save(flags);
+   vmcs12 = get_vmcs12(vcpu);
+   if (!is_guest_mode(vcpu) || !vmcs12) {
+   r = -1;
+   goto out;
+   }
+   if (vector == vmcs12->posted_intr_nv &&
+   nested_cpu_has_posted_intr(vmcs12)) {
+   if (vcpu->mode == IN_GUEST_MODE)
+   apic->send_IPI_mask(get_cpu_mask(vcpu->cpu),
+   POSTED_INTR_VECTOR);
+   else {
+   r = -1;
+   goto out;
+   }
+
+   /*
+* if posted intr is done by hardware, the
+* corresponding eoi was 

[PATCH 4/5] KVM: nVMX: Enable nested virtual interrupt delivery.

2015-01-15 Thread Wincy Van
With virtual interrupt delivery, the hardware spares KVM the
inefficient interrupt injection path. In nested vmx this is
an important feature: it can avoid many nested vmexits,
especially in high-throughput scenarios.
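
As a rough illustration of the four 64-bit EOI-exit bitmaps that this patch
adds to vmcs12, the sketch below maps an interrupt vector to its bitmap and
bit. It assumes the architectural layout in which bitmap 0 covers vectors
0 to 63, bitmap 1 covers 64 to 127, and so on; the demo_ names are
hypothetical and do not appear in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four EOI-exit bitmaps (256 vectors). */
struct demo_eoi_exit_bitmaps {
	uint64_t bitmap[4];
};

/* Mark a vector so that an EOI for it causes an exit. */
static void demo_set_eoi_exit(struct demo_eoi_exit_bitmaps *b, uint8_t vector)
{
	b->bitmap[vector / 64] |= UINT64_C(1) << (vector % 64);
}

/* Report whether an EOI for this vector would cause an exit. */
static bool demo_eoi_causes_exit(const struct demo_eoi_exit_bitmaps *b,
				 uint8_t vector)
{
	return (b->bitmap[vector / 64] >> (vector % 64)) & 1u;
}
```

For example, vector 70 lands in bitmap 1 at bit 6.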

Signed-off-by: Wincy Van 
---
 arch/x86/kvm/vmx.c |   49 +++--
 1 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 99e19bb..ea56e9f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -216,6 +216,10 @@ struct __packed vmcs12 {
u64 virtual_apic_page_addr;
u64 apic_access_addr;
u64 ept_pointer;
+   u64 eoi_exit_bitmap0;
+   u64 eoi_exit_bitmap1;
+   u64 eoi_exit_bitmap2;
+   u64 eoi_exit_bitmap3;
u64 xss_exit_bitmap;
u64 guest_physical_address;
u64 vmcs_link_pointer;
@@ -338,6 +342,7 @@ struct __packed vmcs12 {
u16 guest_gs_selector;
u16 guest_ldtr_selector;
u16 guest_tr_selector;
+   u16 guest_intr_status;
u16 host_es_selector;
u16 host_cs_selector;
u16 host_ss_selector;
@@ -624,6 +629,7 @@ static const unsigned short vmcs_field_to_offset_table[] = {
FIELD(GUEST_GS_SELECTOR, guest_gs_selector),
FIELD(GUEST_LDTR_SELECTOR, guest_ldtr_selector),
FIELD(GUEST_TR_SELECTOR, guest_tr_selector),
+   FIELD(GUEST_INTR_STATUS, guest_intr_status),
FIELD(HOST_ES_SELECTOR, host_es_selector),
FIELD(HOST_CS_SELECTOR, host_cs_selector),
FIELD(HOST_SS_SELECTOR, host_ss_selector),
@@ -641,6 +647,10 @@ static const unsigned short
vmcs_field_to_offset_table[] = {
FIELD64(VIRTUAL_APIC_PAGE_ADDR, virtual_apic_page_addr),
FIELD64(APIC_ACCESS_ADDR, apic_access_addr),
FIELD64(EPT_POINTER, ept_pointer),
+   FIELD64(EOI_EXIT_BITMAP0, eoi_exit_bitmap0),
+   FIELD64(EOI_EXIT_BITMAP1, eoi_exit_bitmap1),
+   FIELD64(EOI_EXIT_BITMAP2, eoi_exit_bitmap2),
+   FIELD64(EOI_EXIT_BITMAP3, eoi_exit_bitmap3),
FIELD64(XSS_EXIT_BITMAP, xss_exit_bitmap),
FIELD64(GUEST_PHYSICAL_ADDRESS, guest_physical_address),
FIELD64(VMCS_LINK_POINTER, vmcs_link_pointer),
@@ -1144,6 +1154,11 @@ static inline bool
nested_cpu_has_apic_reg_virt(struct vmcs12 *vmcs12)
return nested_cpu_has2(vmcs12, SECONDARY_EXEC_APIC_REGISTER_VIRT);
 }

+static inline bool nested_cpu_has_vid(struct vmcs12 *vmcs12)
+{
+   return nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
+}
+
 static inline bool is_exception(u32 intr_info)
 {
return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK))
@@ -2438,6 +2453,7 @@ static void nested_vmx_setup_ctls_msrs(struct
vcpu_vmx *vmx)
SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
SECONDARY_EXEC_APIC_REGISTER_VIRT |
+   SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
SECONDARY_EXEC_WBINVD_EXITING |
SECONDARY_EXEC_XSAVES;

@@ -7346,7 +7362,8 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
return nested_cpu_has2(vmcs12,
SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES);
case EXIT_REASON_APIC_WRITE:
-   /* apic_write should exit unconditionally. */
+   case EXIT_REASON_EOI_INDUCED:
+   /* apic_write and eoi_induced should exit unconditionally. */
return 1;
case EXIT_REASON_EPT_VIOLATION:
/*
@@ -8379,17 +8396,29 @@ static inline int
nested_vmx_check_virt_x2apic(struct kvm_vcpu *vcpu,
return 0;
 }

+static inline int nested_vmx_check_vid(struct kvm_vcpu *vcpu,
+  struct vmcs12 *vmcs12)
+{
+   if (!nested_exit_on_intr(vcpu))
+   return -EINVAL;
+   return 0;
+}
+
 static int nested_vmx_check_apicv_controls(struct kvm_vcpu *vcpu,
   struct vmcs12 *vmcs12)
 {
int r;

if (!nested_cpu_has_virt_x2apic_mode(vmcs12) &&
-   !nested_cpu_has_apic_reg_virt(vmcs12))
+   !nested_cpu_has_apic_reg_virt(vmcs12) &&
+   !nested_cpu_has_vid(vmcs12))
return 0;

if (nested_cpu_has_virt_x2apic_mode(vmcs12))
r = nested_vmx_check_virt_x2apic(vcpu, vmcs12);
+   if (nested_cpu_has_vid(vmcs12))
+   r |= nested_vmx_check_vid(vcpu, vmcs12);
+
if (r)
goto fail;

@@ -8705,6 +8734,19 @@ static void prepare_vmcs02(struct kvm_vcpu
*vcpu, struct vmcs12 *vmcs12)
kvm_vcpu_reload_apic_access_page(vcpu);
}

+   if (exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) {
+   vmcs_write64(EOI_EXIT_BITMAP0,
+   vmcs12->eoi_exit_bitmap0);
+   vmcs_write64(EOI_EXIT_BITMAP1,
+   

[PATCH 3/5] KVM: nVMX: Enable nested apic register virtualization.

2015-01-15 Thread Wincy Van
This feature reduces the cost of apic register virtualization.
It is also a requirement for virtual interrupt delivery and posted
interrupt processing.

Signed-off-by: Wincy Van 
---
 arch/x86/kvm/vmx.c |   12 ++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 10183ee..99e19bb 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1139,6 +1139,11 @@ static inline bool
nested_cpu_has_virt_x2apic_mode(struct vmcs12 *vmcs12)
return nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE);
 }

+static inline bool nested_cpu_has_apic_reg_virt(struct vmcs12 *vmcs12)
+{
+   return nested_cpu_has2(vmcs12, SECONDARY_EXEC_APIC_REGISTER_VIRT);
+}
+
 static inline bool is_exception(u32 intr_info)
 {
return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK))
@@ -2432,6 +2437,7 @@ static void nested_vmx_setup_ctls_msrs(struct
vcpu_vmx *vmx)
vmx->nested.nested_vmx_secondary_ctls_high &=
SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
+   SECONDARY_EXEC_APIC_REGISTER_VIRT |
SECONDARY_EXEC_WBINVD_EXITING |
SECONDARY_EXEC_XSAVES;

@@ -8378,10 +8384,12 @@ static int
nested_vmx_check_apicv_controls(struct kvm_vcpu *vcpu,
 {
int r;

-   if (!nested_cpu_has_virt_x2apic_mode(vmcs12))
+   if (!nested_cpu_has_virt_x2apic_mode(vmcs12) &&
+   !nested_cpu_has_apic_reg_virt(vmcs12))
return 0;

-   r = nested_vmx_check_virt_x2apic(vcpu, vmcs12);
+   if (nested_cpu_has_virt_x2apic_mode(vmcs12))
+   r = nested_vmx_check_virt_x2apic(vcpu, vmcs12);
if (r)
goto fail;

--
1.7.1


[PATCH 2/5] KVM: nVMX: Enable nested virtualize x2apic mode.

2015-01-15 Thread Wincy Van
When L2 is using x2apic, we can use virtualize x2apic mode to
gain higher performance.

This patch also introduces nested_vmx_check_apicv_controls
for the nested apicv patches.
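
The checks that nested_vmx_check_apicv_controls performs in this patch can be
modelled in user space roughly as below. This is a sketch, not the kernel
code: the demo_ function and the DEMO_ bit values are stand-ins chosen for
illustration, and the model ignores the check that secondary controls are
activated at all.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative stand-ins for the control bits named in the patch. */
#define DEMO_VIRTUALIZE_APIC_ACCESSES 0x00000001u
#define DEMO_VIRTUALIZE_X2APIC_MODE   0x00000010u
#define DEMO_TPR_SHADOW               0x00200000u

/* Returns 0 if the control combination is acceptable, -EINVAL otherwise. */
static int demo_check_apicv_controls(uint32_t cpu_based, uint32_t secondary)
{
	/* Nothing to validate unless virtualize x2apic mode is requested. */
	if (!(secondary & DEMO_VIRTUALIZE_X2APIC_MODE))
		return 0;

	/* x2apic mode and APIC-access virtualization are mutually exclusive. */
	if (secondary & DEMO_VIRTUALIZE_APIC_ACCESSES)
		return -EINVAL;

	/* The TPR shadow is needed by all apicv features. */
	if (!(cpu_based & DEMO_TPR_SHADOW))
		return -EINVAL;

	return 0;
}
```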

Signed-off-by: Wincy Van 
---
 arch/x86/kvm/vmx.c |   49 -
 1 files changed, 48 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 954dd54..10183ee 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1134,6 +1134,11 @@ static inline bool nested_cpu_has_xsaves(struct
vmcs12 *vmcs12)
vmx_xsaves_supported();
 }

+static inline bool nested_cpu_has_virt_x2apic_mode(struct vmcs12 *vmcs12)
+{
+   return nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE);
+}
+
 static inline bool is_exception(u32 intr_info)
 {
return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK))
@@ -2426,6 +2431,7 @@ static void nested_vmx_setup_ctls_msrs(struct
vcpu_vmx *vmx)
vmx->nested.nested_vmx_secondary_ctls_low = 0;
vmx->nested.nested_vmx_secondary_ctls_high &=
SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
+   SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
SECONDARY_EXEC_WBINVD_EXITING |
SECONDARY_EXEC_XSAVES;

@@ -7333,6 +7339,9 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
case EXIT_REASON_APIC_ACCESS:
return nested_cpu_has2(vmcs12,
SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES);
+   case EXIT_REASON_APIC_WRITE:
+   /* apic_write should exit unconditionally. */
+   return 1;
case EXIT_REASON_EPT_VIOLATION:
/*
 * L0 always deals with the EPT violation. If nested EPT is
@@ -8356,6 +8365,38 @@ static void vmx_start_preemption_timer(struct
kvm_vcpu *vcpu)
  ns_to_ktime(preemption_timeout), HRTIMER_MODE_REL);
 }

+static inline int nested_vmx_check_virt_x2apic(struct kvm_vcpu *vcpu,
+  struct vmcs12 *vmcs12)
+{
+   if (nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES))
+   return -EINVAL;
+   return 0;
+}
+
+static int nested_vmx_check_apicv_controls(struct kvm_vcpu *vcpu,
+  struct vmcs12 *vmcs12)
+{
+   int r;
+
+   if (!nested_cpu_has_virt_x2apic_mode(vmcs12))
+   return 0;
+
+   r = nested_vmx_check_virt_x2apic(vcpu, vmcs12);
+   if (r)
+   goto fail;
+
+   /* tpr shadow is needed by all apicv features. */
+   if (!nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW)) {
+   r = -EINVAL;
+   goto fail;
+   }
+
+   return 0;
+
+fail:
+   return r;
+}
+
 static int nested_vmx_check_msr_switch(struct kvm_vcpu *vcpu,
   unsigned long count_field,
   unsigned long addr_field,
@@ -8649,7 +8690,8 @@ static void prepare_vmcs02(struct kvm_vcpu
*vcpu, struct vmcs12 *vmcs12)
else
vmcs_write64(APIC_ACCESS_ADDR,
  page_to_phys(vmx->nested.apic_access_page));
-   } else if (vm_need_virtualize_apic_accesses(vmx->vcpu.kvm)) {
+   } else if (!(nested_cpu_has_virt_x2apic_mode(vmcs12)) &&
+   (vm_need_virtualize_apic_accesses(vmx->vcpu.kvm))) {
exec_control |=
SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
kvm_vcpu_reload_apic_access_page(vcpu);
@@ -8856,6 +8898,11 @@ static int nested_vmx_run(struct kvm_vcpu
*vcpu, bool launch)
return 1;
}

+   if (nested_vmx_check_apicv_controls(vcpu, vmcs12)) {
+   nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
+   return 1;
+   }
+
if (nested_vmx_check_msr_switch_controls(vcpu, vmcs12)) {
nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
return 1;
--
1.7.1


RE: [RFC PATCH] fs: file freeze support

2015-01-15 Thread Namjae Jeon

> > For implementation purpose, initially we tried to keep percpu usage counters
> > inside struct inode just like there is struct sb_writers in super_block.
> > But considering that it will significantly bloat up struct inode when actually
> > the usage of file write freeze will be infrequent, we dropped this idea.
> > Instead we have tried to use already present filesystem freezing infrastructure.
> > Current approach makes it possible for implementing file write freeze without
> > bloating any of struct super_block/inode.
> > In FS_IOC_FWFREEZE, we wait for complete fs to be frozen, set I_WRITE_FREEZED to
> > inode's state and unfreeze the fs.
> Looks interesting. I have added some comments below.
Hi Dmitry,
First, Thanks for your opinion.
> >

> > @@ -40,6 +40,7 @@ static int f2fs_vm_page_mkwrite(struct vm_area_struct 
> > *vma,
> >
> > f2fs_balance_fs(sbi);
> >
> > +   inode_start_write(inode);
> > sb_start_pagefault(inode->i_sb);
> IMHO it is reasonable to fold sb_start_{write,pagefault}, to 
> inode_start_{write,pagefault}
Agree.
> >
> > +void inode_start_write(struct inode *inode)
> > +{
> > +   struct super_block *sb = inode->i_sb;
> > +
> > +retry:
> > +   spin_lock(&inode->i_lock);
> This means that i_lock will be acquired on each mkpage_write for all
> users who do not care about fsfreeze, which results in an SMP performance drawback.
> It is reasonable to add a lockless test first, because the flag is set while
> the whole fs is frozen, so we can not enter this routine.
Right, I will remove it.
> 
> > +   if (inode->i_state & I_WRITE_FREEZED) {
> > +   DEFINE_WAIT(wait);
> > +
> > +   prepare_to_wait(&sb->s_writers.wait_unfrozen, &wait,
> > +   TASK_UNINTERRUPTIBLE);
> > +   spin_unlock(&inode->i_lock);
> > +   schedule();
> > +   finish_wait(&sb->s_writers.wait_unfrozen, &wait);
> > +   goto retry;
> > +   }
> > +   spin_unlock(&inode->i_lock);
> > +}
> > diff --git a/fs/ioctl.c b/fs/ioctl.c
> > index 214c3c1..c8e9ae3 100644
> > --- a/fs/ioctl.c
> > +++ b/fs/ioctl.c
> > @@ -540,6 +540,28 @@ static int ioctl_fsthaw(struct file *filp)
> > return thaw_super(sb);
> >  }
> >
> > +static int ioctl_filefreeze(struct file *filp)
> > +{
> > +   struct inode *inode = file_inode(filp);
> > +
> > +   if (!inode_owner_or_capable(inode))
> > +   return -EPERM;
> > +
> > +   /* Freeze */
> > +   return file_write_freeze(inode);
> > +}
> 
> > +
> > +static int ioctl_filethaw(struct file *filp)
> > +{
> > +   struct inode *inode = file_inode(filp);
> > +
> > +   if (!inode_owner_or_capable(inode))
> > +   return -EPERM;
> > +
> > +   /* Thaw */
> > +   return file_write_unfreeze(inode);
> > +}
> > +
> >  /*
> >   * When you add any new common ioctls to the switches above and below
> >   * please update compat_sys_ioctl() too.
> > @@ -589,6 +611,14 @@ int do_vfs_ioctl(struct file *filp, unsigned int fd, 
> > unsigned int cmd,
> > error = ioctl_fsthaw(filp);
> > break;
> >
> > +   case FS_IOC_FWFREEZE:
> > +   error = ioctl_filefreeze(filp);
> > +   break;
> > +
> > +   case FS_IOC_FWTHAW:
> > +   error = ioctl_filethaw(filp);
> > +   break;
> > +
> > case FS_IOC_FIEMAP:
> > return ioctl_fiemap(filp, arg);
> >
> > diff --git a/fs/nilfs2/file.c b/fs/nilfs2/file.c
> > index 3a03e0a..5110d9d 100644
> > --- a/fs/nilfs2/file.c
> > +++ b/fs/nilfs2/file.c
> > @@ -66,6 +66,7 @@ static int nilfs_page_mkwrite(struct vm_area_struct *vma, 
> > struct vm_fault *vmf)
> > if (unlikely(nilfs_near_disk_full(inode->i_sb->s_fs_info)))
> > return VM_FAULT_SIGBUS; /* -ENOSPC */
> >
> > +   inode_start_write(file_inode(vma->vm_file));
> > sb_start_pagefault(inode->i_sb);
> > lock_page(page);
> > if (page->mapping != inode->i_mapping ||
> > diff --git a/fs/ocfs2/mmap.c b/fs/ocfs2/mmap.c
> > index 10d66c7..d073fc2 100644
> > --- a/fs/ocfs2/mmap.c
> > +++ b/fs/ocfs2/mmap.c
> > @@ -136,6 +136,7 @@ static int ocfs2_page_mkwrite(struct vm_area_struct 
> > *vma, struct vm_fault *vmf)
> > sigset_t oldset;
> > int ret;
> >
> > +   inode_start_write(inode);
> > sb_start_pagefault(inode->i_sb);
> > ocfs2_block_signals();
> >
> > diff --git a/fs/super.c b/fs/super.c
> > index eae088f..5e44e42 100644
> > --- a/fs/super.c
> > +++ b/fs/super.c
> > @@ -1393,3 +1393,54 @@ out:
> > return 0;
> >  }
> >  EXPORT_SYMBOL(thaw_super);
> > +
> IMHO it is reasonable to open code this procedure so the user is responsible
> for calling freeze_super() and thaw_super(). This allows freezing
> several inodes in a row, as follows:
> 
> ioctl(sb,FIFREEZE)
> while (f = pop(files_list))
>   ioctl(f,FS_IOC_FWFREEZE)
> ioctl(sb,FITHAW)
> 
> This is required for directory defragmentation (compacting small files).
Good point, I will look into it for V2.
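
For the lockless test on the freeze flag discussed earlier in this reply, a
minimal user-space sketch could look like the following. It is only a model:
the demo_ names are hypothetical, a C11 atomic_flag stands in for i_lock, and
the function returns false where the kernel code would sleep and retry.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_WRITE_FREEZED 0x1u /* hypothetical stand-in for I_WRITE_FREEZED */

struct demo_inode {
	atomic_uint state;       /* stand-in for inode->i_state */
	atomic_flag lock;        /* stand-in for inode->i_lock */
	unsigned int lock_taken; /* counts slow-path lock acquisitions */
};

/* Returns true if a write may proceed, false if it would have to wait. */
static bool demo_inode_may_write(struct demo_inode *inode)
{
	bool frozen;

	/* Fast path: writers on unfrozen files never touch the lock. */
	if (!(atomic_load_explicit(&inode->state, memory_order_acquire) &
	      DEMO_WRITE_FREEZED))
		return true;

	/* Slow path: re-check the flag while holding the lock. */
	while (atomic_flag_test_and_set_explicit(&inode->lock,
						 memory_order_acquire))
		;
	inode->lock_taken++;
	frozen = atomic_load_explicit(&inode->state, memory_order_relaxed) &
		 DEMO_WRITE_FREEZED;
	atomic_flag_clear_explicit(&inode->lock, memory_order_release);

	return !frozen;
}
```

The lock_taken counter exists only to show that the common, unfrozen case
does not acquire the lock.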

Thanks!
> 


[PATCH 1/5] KVM: nVMX: Make nested control MSRs per-cpu.

2015-01-15 Thread Wincy Van
To enable nested apicv support, we need per-cpu vmx
control MSRs:
  1. If in-kernel irqchip is enabled, we can enable nested
     posted interrupts, so we should set the posted intr bit in
     nested_vmx_pinbased_ctls_high.
  2. If in-kernel irqchip is disabled, we can not enable
     nested posted interrupts, so the posted intr bit
     in nested_vmx_pinbased_ctls_high will be cleared.

Since the in-kernel irqchip setting can differ
between VMs, different nested control MSRs
are needed.
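
To illustrate why the "high" word has to vary per VM, the sketch below models
the convention described in the comment this patch touches: a bit in the low
half is set if the control bit must be on, and a bit in the high half is set
if it may be on. The demo_ names and the DEMO_PIN_POSTED_INTR value are
assumptions for illustration, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PIN_POSTED_INTR 0x80u /* illustrative bit for "posted interrupts" */

/* A control word is valid if all must-be-one bits are set and no
 * bit outside the may-be-one mask is set. */
static bool demo_control_verify(uint32_t control, uint32_t low, uint32_t high)
{
	return ((control & high) | low) == control;
}

/* Per-VM "high" word: posted interrupts are allowed only when the
 * in-kernel irqchip is enabled for that VM. */
static uint32_t demo_pinbased_high(uint32_t base_high, bool irqchip_in_kernel)
{
	if (irqchip_in_kernel)
		return base_high | DEMO_PIN_POSTED_INTR;
	return base_high & ~DEMO_PIN_POSTED_INTR;
}
```

With a single global "high" word, both kinds of VM would have to share one
answer for the posted-interrupt bit, which is the limitation the commit
message describes.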

Signed-off-by: Wincy Van 
---
 arch/x86/kvm/vmx.c |  215 +++-
 1 files changed, 129 insertions(+), 86 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ce35071..954dd54 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -408,6 +408,23 @@ struct nested_vmx {

/* to migrate it to L2 if VM_ENTRY_LOAD_DEBUG_CONTROLS is off */
u64 vmcs01_debugctl;
+
+   u32 nested_vmx_procbased_ctls_low;
+   u32 nested_vmx_procbased_ctls_high;
+   u32 nested_vmx_true_procbased_ctls_low;
+   u32 nested_vmx_secondary_ctls_low;
+   u32 nested_vmx_secondary_ctls_high;
+   u32 nested_vmx_pinbased_ctls_low;
+   u32 nested_vmx_pinbased_ctls_high;
+   u32 nested_vmx_exit_ctls_low;
+   u32 nested_vmx_exit_ctls_high;
+   u32 nested_vmx_true_exit_ctls_low;
+   u32 nested_vmx_entry_ctls_low;
+   u32 nested_vmx_entry_ctls_high;
+   u32 nested_vmx_true_entry_ctls_low;
+   u32 nested_vmx_misc_low;
+   u32 nested_vmx_misc_high;
+   u32 nested_vmx_ept_caps;
 };

 #define POSTED_INTR_ON  0
@@ -2289,20 +2306,8 @@ static inline bool nested_vmx_allowed(struct
kvm_vcpu *vcpu)
  * if the corresponding bit in the (32-bit) control field *must* be on, and a
  * bit in the high half is on if the corresponding bit in the control field
  * may be on. See also vmx_control_verify().
- * TODO: allow these variables to be modified (downgraded) by module options
- * or other means.
  */
-static u32 nested_vmx_procbased_ctls_low, nested_vmx_procbased_ctls_high;
-static u32 nested_vmx_true_procbased_ctls_low;
-static u32 nested_vmx_secondary_ctls_low, nested_vmx_secondary_ctls_high;
-static u32 nested_vmx_pinbased_ctls_low, nested_vmx_pinbased_ctls_high;
-static u32 nested_vmx_exit_ctls_low, nested_vmx_exit_ctls_high;
-static u32 nested_vmx_true_exit_ctls_low;
-static u32 nested_vmx_entry_ctls_low, nested_vmx_entry_ctls_high;
-static u32 nested_vmx_true_entry_ctls_low;
-static u32 nested_vmx_misc_low, nested_vmx_misc_high;
-static u32 nested_vmx_ept_caps;
-static __init void nested_vmx_setup_ctls_msrs(void)
+static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
 {
/*
 * Note that as a general rule, the high half of the MSRs (bits in
@@ -2321,57 +2326,71 @@ static __init void nested_vmx_setup_ctls_msrs(void)

/* pin-based controls */
rdmsr(MSR_IA32_VMX_PINBASED_CTLS,
- nested_vmx_pinbased_ctls_low, nested_vmx_pinbased_ctls_high);
-   nested_vmx_pinbased_ctls_low |= PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
-   nested_vmx_pinbased_ctls_high &= PIN_BASED_EXT_INTR_MASK |
-   PIN_BASED_NMI_EXITING | PIN_BASED_VIRTUAL_NMIS;
-   nested_vmx_pinbased_ctls_high |= PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR |
+   vmx->nested.nested_vmx_pinbased_ctls_low,
+   vmx->nested.nested_vmx_pinbased_ctls_high);
+   vmx->nested.nested_vmx_pinbased_ctls_low |=
+   PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
+   vmx->nested.nested_vmx_pinbased_ctls_high &=
+   PIN_BASED_EXT_INTR_MASK |
+   PIN_BASED_NMI_EXITING |
+   PIN_BASED_VIRTUAL_NMIS;
+   vmx->nested.nested_vmx_pinbased_ctls_high |=
+   PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR |
PIN_BASED_VMX_PREEMPTION_TIMER;

/* exit controls */
rdmsr(MSR_IA32_VMX_EXIT_CTLS,
-   nested_vmx_exit_ctls_low, nested_vmx_exit_ctls_high);
-   nested_vmx_exit_ctls_low = VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;
+   vmx->nested.nested_vmx_exit_ctls_low,
+   vmx->nested.nested_vmx_exit_ctls_high);
+   vmx->nested.nested_vmx_exit_ctls_low =
+   VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;

-   nested_vmx_exit_ctls_high &=
+   vmx->nested.nested_vmx_exit_ctls_high &=
 #ifdef CONFIG_X86_64
VM_EXIT_HOST_ADDR_SPACE_SIZE |
 #endif
VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT;
-   nested_vmx_exit_ctls_high |= VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR |
+   vmx->nested.nested_vmx_exit_ctls_high |=
+   VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR |
VM_EXIT_LOAD_IA32_EFER | VM_EXIT_SAVE_IA32_EFER |
VM_EXIT_SAVE_VMX_PREEMPTION_TIMER | VM_EXIT_ACK_INTR_ON_EXIT;

if (vmx_mpx_supported())
-   nested_vmx_exit_ctls_high |= VM_EXIT_CLEAR_BNDCFGS;
+   vmx->nested.nested_vmx_exit_ctls_high |= 

Re: [RFC 01/11] i2c: add quirk structure to describe adapter flaws

2015-01-15 Thread Eddie Huang
Hi Wolfram,

On Fri, 2015-01-09 at 18:21 +0100, Wolfram Sang wrote:
>  
> + */
> +struct i2c_adapter_quirks {
> + u64 flags;
> + int max_num_msgs;
> + u16 max_write_len;
> + u16 max_read_len;
> + u16 max_comb_write_len;
> + u16 max_comb_read_len;
> +};
> +
> +#define I2C_ADAPTER_QUIRK_COMB_WRITE_FIRST   BIT(0)
> +#define I2C_ADAPTER_QUIRK_COMB_READ_SECOND   BIT(1)
> +#define I2C_ADAPTER_QUIRK_COMB_WRITE_THEN_READ   (I2C_ADAPTER_QUIRK_COMB_WRITE_FIRST | \
> +                                          I2C_ADAPTER_QUIRK_COMB_READ_SECOND)
> +
>  /*
>   * i2c_adapter is the structure used to identify a physical i2c bus along
>   * with the access algorithms necessary to access it.
> @@ -472,6 +506,7 @@ struct i2c_adapter {
>   struct list_head userspace_clients;
>  
>   struct i2c_bus_recovery_info *bus_recovery_info;
> + struct i2c_adapter_quirks *quirks;
>  };
>  #define to_i2c_adapter(d) container_of(d, struct i2c_adapter, dev)
>  

I suggest adding const:
const struct i2c_adapter_quirks *quirks;

Also, i2c-core.c should be modified accordingly:
const struct i2c_adapter_quirks *q = adap->quirks;
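
A small stand-alone sketch of the idea, using hypothetical demo_ types that
only mirror a few of the fields quoted above: with a const pointer, a driver
can keep its quirks table in read-only data and the core can only read it.
The "zero means no limit" rule is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of a few quirk fields. */
struct demo_adapter_quirks {
	uint64_t flags;
	int max_num_msgs;
	uint16_t max_write_len;
	uint16_t max_read_len;
};

struct demo_adapter {
	const struct demo_adapter_quirks *quirks; /* read-only for the core */
};

/* A driver can now declare its table const. */
static const struct demo_adapter_quirks demo_hw_quirks = {
	.max_num_msgs = 2,
	.max_write_len = 255,
	.max_read_len = 255,
};

static bool demo_write_len_ok(const struct demo_adapter *adap, size_t len)
{
	const struct demo_adapter_quirks *q = adap->quirks;

	/* Assumed for this sketch: no quirks or a zero limit means no limit. */
	if (!q || q->max_write_len == 0)
		return true;
	return len <= q->max_write_len;
}
```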





[PATCH 0/5] KVM: nVMX: Enable nested apicv support.

2015-01-15 Thread Wincy Van
In nested vmx, the efficiency of interrupt virtualization is
very important, especially in high-throughput scenarios.

This patch set enables nested apicv support, which makes a
huge improvement in nested interrupt virtualization.

I also have done some simple tests:
L0: Intel Xeon E5-2630 v2
L1: CentOS 6.5 with 3.10.64-1.el6.elrepo.x86_64 kernel
16 vcpus, 32GB memory.

L2: Windows Server 2008 R2 Datacenter
8 vcpus, 16GB memory.

 1. Run wprime 32M, 8 threads.

     original        nested apicv

     7.782s          7.172s

Improvement: 7.8%

 2. Run iperf -s -w 64k in L1,
iperf -c 10.1.0.2 -p 5001 -i 1 -t 30 -P 8 -w 64k in L2

     original        nested apicv

     2.12 Gbits/s    3.50 Gbits/s

Improvement: 65.0%

_

L2: CentOS 6.5 with 2.6.32-431 kernel
8 vcpus, 16GB memory.

 1. Run iperf -s -w 64k in L1,
iperf -c 10.1.0.2 -p 5001 -i 1 -t 30 -P 8 -w 64k in L2

     original        nested apicv

     6.58 Gbits/s    14.2 Gbits/s

Improvement: 115.8%

Wincy Van (5):
  KVM: nVMX: Make nested control MSRs per-cpu.
  KVM: nVMX: Enable nested virtualize x2apic mode.
  KVM: nVMX: Enable nested apic register virtualization.
  KVM: nVMX: Enable nested virtual interrupt delivery.
  KVM: nVMX: Enable nested posted interrupt processing.

 arch/x86/kvm/vmx.c |  444 +---
 1 files changed, 355 insertions(+), 89 deletions(-)


Re: [PATCH v3 2/3] net/macb: Add whitespace around arithmetic operators

2015-01-15 Thread David Miller
From: Xander Huff 
Date: Thu, 15 Jan 2015 15:45:15 -0600

> Spaces should surround add, multiply, and bitshift operators.
> 
> Signed-off-by: Xander Huff 

Applied.


Re: [PATCH v3 1/3] net/macb: Fix comments to meet style guidelines

2015-01-15 Thread David Miller
From: Xander Huff 
Date: Thu, 15 Jan 2015 15:45:14 -0600

> Change comments to not exceed 80 characters per line.
> Update block comments in macb.h to start on the line after /*.
> 
> Signed-off-by: Xander Huff 

Applied.


Re: [PATCH v3 3/3] net/macb: Create gem_ethtool_ops for new statistics functions

2015-01-15 Thread David Miller
From: Xander Huff 
Date: Thu, 15 Jan 2015 15:45:16 -0600

> 10/100 MACB does not have the same statistics possibilities as GEM. Separate
> macb_ethtool_ops to make a new GEM-specific struct with the new statistics
> functions included.
> 
> Signed-off-by: Xander Huff 

Applied.


Re: [PATCH net-next] Driver: Vmxnet3: Fix ethtool -S to return correct rx queue stats

2015-01-15 Thread David Miller
From: Shrikrishna Khare 
Date: Thu, 15 Jan 2015 11:54:30 -0800

> Signed-off-by: Gao Zhenyu 
> Signed-off-by: Shrikrishna Khare 
> Reviewed-by: Shreyas N Bhatewara 

Applied, thank you.


Re: [RFC PATCH 0/2 shit_A shit_B] workqueue: fix wq_numa bug

2015-01-15 Thread Yasuaki Ishimatsu
Hi Lai,

Thank you for posting the patch set.

I'll try it next Monday, so please wait a while.

Thanks,
Yasuaki Ishimatsu


(2015/01/14 17:54), Lai Jiangshan wrote:
> Hi, All
> 
> These patches are un-changelogged, un-compiled, un-booted, un-tested;
> they are just shits, I even hope them un-sent or blocked.
> 
> The patches include two -solutions-:
> 
> Shit_A:
>workqueue: reset pool->node and unhash the pool when the node is
>  offline
>update wq_numa when cpu_present_mask changed
> 
>   kernel/workqueue.c | 107 
> +
>   1 file changed, 84 insertions(+), 23 deletions(-)
> 
> 
> Shit_B:
>workqueue: reset pool->node and unhash the pool when the node is
>  offline
>workqueue: remove wq_numa_possible_cpumask
>workqueue: directly update attrs of pools when cpu hot[un]plug
> 
>   kernel/workqueue.c | 135 
> +++--
>   1 file changed, 101 insertions(+), 34 deletions(-)
> 
> 
> Patch 1 is the same in both solutions: reset pool->node and unhash the pool.
> It was suggested by TJ, and I found it to be a good first step for fixing the bug.
> 
> The other patches handle wq_numa_possible_cpumask, which is where the solutions
> diverge.
> 
> Solution_A uses present_mask rather than possible_cpumask. It adds
> wq_numa_notify_cpu_present_set/cleared() for notifications of
> changes to cpu_present_mask.  But such notifications do not exist
> right now, so I fake one (wq_numa_check_present_cpumask_changes())
> to imitate them.  I hope the memory people add a real one.
> 
> Solution_B uses online_mask rather than possible_cpumask.
> This solution removes more coupling between numa_code and workqueue;
> it just depends on cpumask_of_node(node).
> 
> Patch2_of_Solution_B removes wq_numa_possible_cpumask and adds
> overhead to cpu hot[un]plug; Patch3 reduces this overhead.
> 
> Thanks,
> Lai
> 
> 
> Reported-by: Yasuaki Ishimatsu 
> Cc: Tejun Heo 
> Cc: Yasuaki Ishimatsu 
> Cc: "Gu, Zheng" 
> Cc: tangchen 
> Cc: Hiroyuki KAMEZAWA 
> 




Re: [PATCH v2] mm: vmscan: fix the page state calculation in too_many_isolated

2015-01-15 Thread Vinayak Menon

On 01/16/2015 06:47 AM, Andrew Morton wrote:
> On Wed, 14 Jan 2015 17:06:59 +0530 Vinayak Menon wrote:
>
>> It is observed that sometimes multiple tasks get blocked for long
>> in the congestion_wait loop below, in shrink_inactive_list. This
>> is because of vm_stat values not being synced.
>>
>> (__schedule) from []
>> (schedule_timeout) from []
>> (io_schedule_timeout) from []
>> (congestion_wait) from []
>> (shrink_inactive_list) from []
>> (shrink_zone) from []
>> (try_to_free_pages) from []
>> (__alloc_pages_nodemask) from []
>> (new_slab) from []
>> (__slab_alloc) from []
>>
>> In one such instance, zone_page_state(zone, NR_ISOLATED_FILE)
>> had returned 14, zone_page_state(zone, NR_INACTIVE_FILE)
>> returned 92, and GFP_IOFS was set, and this resulted
>> in too_many_isolated returning true. But one of the CPU's
>> pageset vm_stat_diff had NR_ISOLATED_FILE as "-14". So the
>> actual isolated count was zero. As there weren't any more
>> updates to NR_ISOLATED_FILE and vmstat_update deferred work
>> had not been scheduled yet, 7 tasks were spinning in the
>> congestion wait loop for around 4 seconds, in the direct
>> reclaim path.
>>
>> This patch uses zone_page_state_snapshot instead, but restricts
>> its usage to avoid performance penalty.
>
> Seems reasonable.
>
>> ...
>>
>> @@ -1516,15 +1531,18 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
>> unsigned long nr_immediate = 0;
>> isolate_mode_t isolate_mode = 0;
>> int file = is_file_lru(lru);
>> +   int safe = 0;
>> struct zone *zone = lruvec_zone(lruvec);
>> struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
>>
>> -   while (unlikely(too_many_isolated(zone, file, sc))) {
>> +   while (unlikely(too_many_isolated(zone, file, sc, safe))) {
>> congestion_wait(BLK_RW_ASYNC, HZ/10);
>>
>> /* We are about to die and free our memory. Return now. */
>> if (fatal_signal_pending(current))
>> return SWAP_CLUSTER_MAX;
>> +
>> +   safe = 1;
>> }
>
> But here and under the circumstances you describe, we'll call
> congestion_wait() a single time.  That shouldn't have occurred.
>
> So how about we put the fallback logic into too_many_isolated() itself?


congestion_wait was allowed to run once as an optimization, considering 
that __too_many_isolated (unsafe and faster) can be correct in returning 
true most of the time. So we avoid calling the safe version, in most of 
the cases. But I agree that we should not call congestion_wait 
unnecessarily even in those rare cases. So this looks correct to me.
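
To make the numbers from the report concrete, here is a userspace sketch of the two read paths. It is a toy model, not kernel code: stat_sketch, NR_CPUS_SKETCH and the helper names are made up, and the >> 3 scaling for GFP_IOFS is an assumption about what __too_many_isolated() did at the time.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for the model */

/* One vmstat item: a global value plus per-CPU deltas not yet folded in. */
struct stat_sketch {
	long global;
	long cpu_diff[NR_CPUS_SKETCH];
};

/* Fast read: global part only, in the spirit of zone_page_state(). */
static unsigned long state_fast(const struct stat_sketch *s)
{
	assert(s != NULL);
	return s->global < 0 ? 0 : (unsigned long)s->global;
}

/* Accurate read: adds pending deltas, like zone_page_state_snapshot(). */
static unsigned long state_snapshot(const struct stat_sketch *s)
{
	long x;
	size_t cpu;

	assert(s != NULL);
	x = s->global;
	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		x += s->cpu_diff[cpu];
	return x < 0 ? 0 : (unsigned long)x;
}

/* Assumed comparison: inactive is scaled down by 8 when GFP_IOFS is set. */
static int exceeds(unsigned long isolated, unsigned long inactive, int iofs)
{
	if (iofs)
		inactive >>= 3;
	return isolated > inactive;
}

/* Trust the fast read when it says "fine"; recheck accurately otherwise. */
static int too_many_sketch(const struct stat_sketch *isolated,
			   const struct stat_sketch *inactive, int iofs)
{
	if (!exceeds(state_fast(isolated), state_fast(inactive), iofs))
		return 0;
	return exceeds(state_snapshot(isolated), state_snapshot(inactive), iofs);
}
```

With isolated = 14 globally and a pending per-CPU delta of -14, and inactive = 92, the fast path sees 14 > (92 >> 3) = 11 and would report "too many", while the snapshot sees 0 and lets the task proceed without calling congestion_wait().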





From: Andrew Morton 
Subject: mm-vmscan-fix-the-page-state-calculation-in-too_many_isolated-fix

Move the zone_page_state_snapshot() fallback logic into
too_many_isolated(), so shrink_inactive_list() doesn't incorrectly call
congestion_wait().

Cc: Johannes Weiner 
Cc: Mel Gorman 
Cc: Michal Hocko 
Cc: Minchan Kim 
Cc: Vinayak Menon 
Cc: Vladimir Davydov 
Signed-off-by: Andrew Morton 
---

  mm/vmscan.c |   23 +++
  1 file changed, 11 insertions(+), 12 deletions(-)

diff -puN 
mm/vmscan.c~mm-vmscan-fix-the-page-state-calculation-in-too_many_isolated-fix 
mm/vmscan.c
--- 
a/mm/vmscan.c~mm-vmscan-fix-the-page-state-calculation-in-too_many_isolated-fix
+++ a/mm/vmscan.c
@@ -1402,7 +1402,7 @@ int isolate_lru_page(struct page *page)
  }

  static int __too_many_isolated(struct zone *zone, int file,
-   struct scan_control *sc, int safe)
+  struct scan_control *sc, int safe)
  {
unsigned long inactive, isolated;

@@ -1435,7 +1435,7 @@ static int __too_many_isolated(struct zo
   * unnecessary swapping, thrashing and OOM.
   */
  static int too_many_isolated(struct zone *zone, int file,
-   struct scan_control *sc, int safe)
+struct scan_control *sc)
  {
if (current_is_kswapd())
return 0;
@@ -1443,12 +1443,14 @@ static int too_many_isolated(struct zone
if (!global_reclaim(sc))
return 0;

-   if (unlikely(__too_many_isolated(zone, file, sc, 0))) {
-   if (safe)
-   return __too_many_isolated(zone, file, sc, safe);
-   else
-   return 1;
-   }
+   /*
+* __too_many_isolated(safe=0) is fast but inaccurate, because it
+* doesn't account for the vm_stat_diff[] counters.  So if it looks
+* like too_many_isolated() is about to return true, fall back to the
+* slower, more accurate zone_page_state_snapshot().
+*/
+   if (unlikely(__too_many_isolated(zone, file, sc, 0)))
+   return __too_many_isolated(zone, file, sc, safe);

return 0;
  }
@@ -1540,18 +1542,15 @@ shrink_inactive_list(unsigned long nr_to
unsigned long nr_immediate = 0;
isolate_mode_t isolate_mode = 0;
int file = is_file_lru(lru);
-   int safe = 0;
struct zone *zone = lruvec_zone(lruvec);
 struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;

-   while 

Re: [PATCH v4] gpio_wdt: Add "always_running" feature to GPIO watchdog

2015-01-15 Thread Guenter Roeck

On 01/13/2015 10:28 PM, Mike Looijmans wrote:

On some chips, like the TPS386000, the trigger cannot be disabled
and the CPU must keep toggling the line at all times. Add a switch
"always_running" to keep toggling the GPIO line regardless of the
state of the soft part of the watchdog. The "armed" member keeps
track of whether a timeout must also cause a reset.

Signed-off-by: Mike Looijmans 


Reviewed-by: Guenter Roeck 



Re: [PATCH] gpio: gpio-dln2: Added a Blank line after declaration

2015-01-15 Thread Jamal Mohammad
I think you are right. checkpatch.pl was giving the error at that
line, so I added a blank line. I will send an updated patch.

On Thu, Jan 15, 2015 at 11:40 PM, Johan Hovold  wrote:
> On Thu, Jan 15, 2015 at 06:20:43PM +0100, Linus Walleij wrote:
>> On Tue, Jan 13, 2015 at 4:09 PM, Mohammad Jamal
>>  wrote:
>>
>> > Fix the coding style issue by adding a blank line after declaration
>> >
>> > Signed-off-by: Mohammad Jamal 
>>
>> Patch applied.
>
> This one looks bogus; it's adding a random newline within the
> declarations not after.
>
> Johan




[PATCH v3] X86-32: Allocate 32 bytes for pgd in PAE paging

2015-01-15 Thread Fenghua Yu
From: Fenghua Yu 

With more embedded systems emerging using Quark, among other things,
the 32-bit kernel matters again. A 32-bit machine and kernel use PAE paging,
which currently wastes almost 4K of memory per process on Linux because we
have to reserve an entire page to support a single 32-byte PGD structure.
It would be a very good thing if we could eliminate that wastage.

PAE paging is used to access more than 4GB of memory on x86-32, and it is
required for NX.

In this patch, we still allocate one page for the pgd for a Xen domain and a
64-bit kernel because a one-page pgd is assumed in these cases. But we can save
memory by allocating only a 32-byte pgd for a 32-bit PAE kernel when it is not
running as a Xen domain.

Signed-off-by: Fenghua Yu 
---
 arch/x86/mm/pgtable.c | 81 +--
 1 file changed, 78 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 6fb6927..d223e1f 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -271,12 +271,87 @@ static void pgd_prepopulate_pmd(struct mm_struct *mm, 
pgd_t *pgd, pmd_t *pmds[])
}
 }
 
+/*
+ * Xen paravirt assumes pgd table should be in one page. 64 bit kernel also
+ * assumes that pgd should be in one page.
+ *
+ * But kernel with PAE paging that is not running as a Xen domain
+ * only needs to allocate 32 bytes for pgd instead of one page.
+ */
+#ifdef CONFIG_X86_PAE
+
+#include 
+
+#define PGD_SIZE   (PTRS_PER_PGD * sizeof(pgd_t))
+#define PGD_ALIGN  32
+
+static struct kmem_cache *pgd_cache;
+
+static int __init pgd_cache_init(void)
+{
+   /*
+* When PAE kernel is running as a Xen domain, it does not use
+* shared kernel pmd. And this requires a whole page for pgd.
+*/
+   if (!SHARED_KERNEL_PMD)
+   return 0;
+
+   /*
+* when PAE kernel is not running as a Xen domain, it uses
+* shared kernel pmd. Shared kernel pmd does not require a whole
+* page for pgd. We are able to just allocate a 32-byte for pgd.
+* During boot time, we create a 32-byte slab for pgd table allocation.
+*/
+   pgd_cache = kmem_cache_create("pgd_cache", PGD_SIZE, PGD_ALIGN,
+ SLAB_PANIC, NULL);
+   if (!pgd_cache)
+   return -ENOMEM;
+
+   return 0;
+}
+core_initcall(pgd_cache_init);
+
+static inline pgd_t *_pgd_alloc(void)
+{
+   /*
+* If no SHARED_KERNEL_PMD, PAE kernel is running as a Xen domain.
+* We allocate one page for pgd.
+*/
+   if (!SHARED_KERNEL_PMD)
+   return (pgd_t *)__get_free_page(PGALLOC_GFP);
+
+   /*
+* Now PAE kernel is not running as a Xen domain. We can allocate
+* a 32-byte slab for pgd to save memory space.
+*/
+   return kmem_cache_alloc(pgd_cache, PGALLOC_GFP);
+}
+
+static inline void _pgd_free(pgd_t *pgd)
+{
+   if (!SHARED_KERNEL_PMD)
+   free_page((unsigned long)pgd);
+   else
+   kmem_cache_free(pgd_cache, pgd);
+}
+#else
+static inline pgd_t *_pgd_alloc(void)
+{
+   return (pgd_t *)__get_free_page(PGALLOC_GFP);
+}
+
+static inline void _pgd_free(pgd_t *pgd)
+{
+   free_page((unsigned long)pgd);
+}
+#endif /* CONFIG_X86_PAE */
+
 pgd_t *pgd_alloc(struct mm_struct *mm)
 {
pgd_t *pgd;
pmd_t *pmds[PREALLOCATED_PMDS];
 
-   pgd = (pgd_t *)__get_free_page(PGALLOC_GFP);
+   pgd = _pgd_alloc();
 
if (pgd == NULL)
goto out;
@@ -306,7 +381,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 out_free_pmds:
free_pmds(pmds);
 out_free_pgd:
-   free_page((unsigned long)pgd);
+   _pgd_free(pgd);
 out:
return NULL;
 }
@@ -316,7 +391,7 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd)
pgd_mop_up_pmds(mm, pgd);
pgd_dtor(pgd);
paravirt_pgd_free(mm, pgd);
-   free_page((unsigned long)pgd);
+   _pgd_free(pgd);
 }
 
 /*
-- 
1.8.1.2
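
The 32-byte figure in the changelog follows from PGD_SIZE = PTRS_PER_PGD * sizeof(pgd_t). A small sketch of the arithmetic, assuming the usual PAE values of four top-level entries of 64 bits each and a 4K page (the *_SKETCH and *_PAE names are illustrative, not kernel symbols):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTRS_PER_PGD_PAE 4	/* PAE top level holds four entries */
#define PAGE_SIZE_SKETCH 4096	/* 4K page, as in the changelog */

typedef uint64_t pgd_entry_t;	/* PAE entries are 64 bits wide */

/* Bytes actually needed for a PAE pgd. */
static size_t pae_pgd_bytes(void)
{
	return PTRS_PER_PGD_PAE * sizeof(pgd_entry_t);
}

/* Bytes left unused per process when a whole page backs the pgd. */
static size_t pae_pgd_waste(void)
{
	assert(PAGE_SIZE_SKETCH >= pae_pgd_bytes());
	return PAGE_SIZE_SKETCH - pae_pgd_bytes();
}
```

pae_pgd_bytes() is 32 and pae_pgd_waste() is 4064, which is the per-process saving the slab cache is after.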



linux-next: manual merge of the wireless-drivers-next tree with the wireless-drivers tree

2015-01-15 Thread Stephen Rothwell
Hi Kalle,

Today's linux-next merge of the wireless-drivers-next tree got a
conflict in drivers/net/wireless/iwlwifi/mvm/scan.c between commit
1f9c418fd94c ("iwlwifi: mvm: fix EBS on single scan") from the
wireless-drivers tree and commit a1ed4025765c ("iwlwifi: mvm: Configure
EBS scan ratio") from the wireless-drivers-next tree.

I fixed it up (the latter moved the code modified by the former, but
also seems to have included the fix from the former patch) and can
carry the fix as necessary (no action is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




[PATCH tip 2/9] tracing: allow eBPF programs to call bpf_printk()

2015-01-15 Thread Alexei Starovoitov
Debugging of eBPF programs needs some form of printk from the program,
so let programs call limited trace_printk() with %d %u %x %p modifiers only.

Signed-off-by: Alexei Starovoitov 
---
 include/uapi/linux/bpf.h|1 +
 kernel/trace/bpf_trace.c|   61 +++
 kernel/trace/trace_events.c |8 ++
 3 files changed, 70 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 959538c50117..ef88e3f45b85 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -170,6 +170,7 @@ enum bpf_func_id {
BPF_FUNC_fetch_u8,/* u8 bpf_fetch_u8(void *unsafe_ptr) */
BPF_FUNC_memcmp,  /* int bpf_memcmp(void *unsafe_ptr, void 
*safe_ptr, int size) */
BPF_FUNC_dump_stack,  /* void bpf_dump_stack(void) */
+   BPF_FUNC_printk,  /* int bpf_printk(const char *fmt, int 
fmt_size, ...) */
__BPF_FUNC_MAX_ID,
 };
 
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 639d3c25dead..3825d7a3cbd1 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -60,6 +60,60 @@ static u64 bpf_dump_stack(u64 r1, u64 r2, u64 r3, u64 r4, 
u64 r5)
return 0;
 }
 
+/* limited printk()
+ * only %d %u %x %ld %lu %lx %lld %llu %llx %p conversion specifiers allowed
+ */
+static u64 bpf_printk(u64 r1, u64 fmt_size, u64 r3, u64 r4, u64 r5)
+{
+   char *fmt = (char *) (long) r1;
+   int fmt_cnt = 0;
+   bool mod_l[3] = {};
+   int i;
+
+   /* bpf_check() guarantees that fmt points to bpf program stack and
+* fmt_size bytes of it were initialized by bpf program
+*/
+   if (fmt[fmt_size - 1] != 0)
+   return -EINVAL;
+
+   /* check format string for allowed specifiers */
+   for (i = 0; i < fmt_size; i++)
+   if (fmt[i] == '%') {
+   if (fmt_cnt >= 3)
+   return -EINVAL;
+   i++;
+   if (i >= fmt_size)
+   return -EINVAL;
+
+   if (fmt[i] == 'l') {
+   mod_l[fmt_cnt] = true;
+   i++;
+   if (i >= fmt_size)
+   return -EINVAL;
+   } else if (fmt[i] == 'p') {
+   mod_l[fmt_cnt] = true;
+   fmt_cnt++;
+   continue;
+   }
+
+   if (fmt[i] == 'l') {
+   mod_l[fmt_cnt] = true;
+   i++;
+   if (i >= fmt_size)
+   return -EINVAL;
+   }
+
+   if (fmt[i] != 'd' && fmt[i] != 'u' && fmt[i] != 'x')
+   return -EINVAL;
+   fmt_cnt++;
+   }
+
+   return __trace_printk((unsigned long) __builtin_return_address(3), fmt,
+ mod_l[0] ? r3 : (u32) r3,
+ mod_l[1] ? r4 : (u32) r4,
+ mod_l[2] ? r5 : (u32) r5);
+}
+
 static struct bpf_func_proto tracing_filter_funcs[] = {
 #define FETCH(SIZE)\
[BPF_FUNC_fetch_##SIZE] = { \
@@ -86,6 +140,13 @@ static struct bpf_func_proto tracing_filter_funcs[] = {
.gpl_only = false,
.ret_type = RET_VOID,
},
+   [BPF_FUNC_printk] = {
+   .func = bpf_printk,
+   .gpl_only = true,
+   .ret_type = RET_INTEGER,
+   .arg1_type = ARG_PTR_TO_STACK,
+   .arg2_type = ARG_CONST_STACK_SIZE,
+   },
 };
 
 static const struct bpf_func_proto *tracing_filter_func_proto(enum bpf_func_id 
func_id)
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 189cc4d697b5..282ea5822480 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1141,6 +1141,14 @@ event_filter_write(struct file *filp, const char __user 
*ubuf, size_t cnt,
 
 mutex_unlock(&event_mutex);
 
+   if (file && file->flags & TRACE_EVENT_FL_BPF) {
+   /*
+* allocate per-cpu printk buffers, since programs
+* might be calling bpf_printk
+*/
+   trace_printk_init_buffers();
+   }
+
free_page((unsigned long) buf);
if (err < 0)
return err;
-- 
1.7.9.5
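
The format-string check in bpf_printk() above can be tried out in userspace with the sketch below. It is a rewrite for illustration, not the patch's code: check_fmt is a made-up name, and it returns the number of conversions instead of calling __trace_printk(). The rules are meant to match the patch: the buffer must end in a NUL, at most three conversions, and only %p or d/u/x with an optional l or ll.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns the number of conversions (0..3), or -1 if the format is rejected.
 * mod_l[n] is set when argument n must be passed as a 64-bit value. */
static int check_fmt(const char *fmt, size_t fmt_size, bool mod_l[3])
{
	size_t i;
	int fmt_cnt = 0;
	int l;

	assert(fmt != NULL && mod_l != NULL);

	/* the buffer must be NUL terminated */
	if (fmt_size == 0 || fmt[fmt_size - 1] != '\0')
		return -1;

	for (i = 0; i < fmt_size; i++) {
		if (fmt[i] != '%')
			continue;
		if (fmt_cnt >= 3)
			return -1;	/* only three arguments are available */
		i++;
		if (i >= fmt_size)
			return -1;

		if (fmt[i] == 'p') {
			mod_l[fmt_cnt++] = true;
			continue;
		}

		/* accept an optional "l" or "ll" length modifier */
		for (l = 0; l < 2 && i < fmt_size && fmt[i] == 'l'; l++) {
			mod_l[fmt_cnt] = true;
			i++;
		}
		if (i >= fmt_size)
			return -1;

		if (fmt[i] != 'd' && fmt[i] != 'u' && fmt[i] != 'x')
			return -1;
		fmt_cnt++;
	}
	return fmt_cnt;
}
```

For example, "%d %lx %p\n" is accepted with mod_l = {false, true, true}, while "%s" or a fourth conversion is rejected.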



[PATCH tip 5/9] samples: bpf: simple tracing example in C

2015-01-15 Thread Alexei Starovoitov
tracex1_kern.c - C program which will be compiled into eBPF
to filter netif_receive_skb events on skb->dev->name == "lo".
The program returns 1 to continue storing an event into the trace buffer
and returns 0 to discard it.

tracex1_user.c - corresponding user space component that
forever reads /sys/.../trace_pipe

Usage:
$ sudo tracex1

should see:
writing bpf-4 -> /sys/kernel/debug/tracing/events/net/netif_receive_skb/filter
  ping-364   [000] ..s2 8.089771: netif_receive_skb: dev=lo 
skbaddr=88000dfcc100 len=84
  ping-364   [000] ..s2 8.089889: netif_receive_skb: dev=lo 
skbaddr=88000dfcc900 len=84

Ctrl-C at any time, kernel will auto cleanup

Signed-off-by: Alexei Starovoitov 
---
 samples/bpf/Makefile   |4 +++
 samples/bpf/bpf_helpers.h  |   18 ++
 samples/bpf/bpf_load.c |   59 +++-
 samples/bpf/bpf_load.h |3 +++
 samples/bpf/tracex1_kern.c |   28 +
 samples/bpf/tracex1_user.c |   24 ++
 6 files changed, 130 insertions(+), 6 deletions(-)
 create mode 100644 samples/bpf/tracex1_kern.c
 create mode 100644 samples/bpf/tracex1_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 789691374562..da28e1b6d3a6 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -7,6 +7,7 @@ hostprogs-y += sock_example
 hostprogs-y += sockex1
 hostprogs-y += sockex2
 hostprogs-y += dropmon
+hostprogs-y += tracex1
 
 dropmon-objs := dropmon.o libbpf.o
 test_verifier-objs := test_verifier.o libbpf.o
@@ -14,17 +15,20 @@ test_maps-objs := test_maps.o libbpf.o
 sock_example-objs := sock_example.o libbpf.o
 sockex1-objs := bpf_load.o libbpf.o sockex1_user.o
 sockex2-objs := bpf_load.o libbpf.o sockex2_user.o
+tracex1-objs := bpf_load.o libbpf.o tracex1_user.o
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
 always += sockex1_kern.o
 always += sockex2_kern.o
+always += tracex1_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 
 HOSTCFLAGS_bpf_load.o += -I$(objtree)/usr/include -Wno-unused-variable
 HOSTLOADLIBES_sockex1 += -lelf
 HOSTLOADLIBES_sockex2 += -lelf
+HOSTLOADLIBES_tracex1 += -lelf
 
 # point this to your LLVM backend with bpf support
 LLC=$(srctree)/tools/bpf/llvm/bld/Debug+Asserts/bin/llc
diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index ca0333146006..81388e821eb3 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -15,6 +15,24 @@ static int (*bpf_map_update_elem)(void *map, void *key, void 
*value,
(void *) BPF_FUNC_map_update_elem;
 static int (*bpf_map_delete_elem)(void *map, void *key) =
(void *) BPF_FUNC_map_delete_elem;
+static void *(*bpf_fetch_ptr)(void *unsafe_ptr) =
+   (void *) BPF_FUNC_fetch_ptr;
+static unsigned long long (*bpf_fetch_u64)(void *unsafe_ptr) =
+   (void *) BPF_FUNC_fetch_u64;
+static unsigned int (*bpf_fetch_u32)(void *unsafe_ptr) =
+   (void *) BPF_FUNC_fetch_u32;
+static unsigned short (*bpf_fetch_u16)(void *unsafe_ptr) =
+   (void *) BPF_FUNC_fetch_u16;
+static unsigned char (*bpf_fetch_u8)(void *unsafe_ptr) =
+   (void *) BPF_FUNC_fetch_u8;
+static int (*bpf_printk)(const char *fmt, int fmt_size, ...) =
+   (void *) BPF_FUNC_printk;
+static int (*bpf_memcmp)(void *unsafe_ptr, void *safe_ptr, int size) =
+   (void *) BPF_FUNC_memcmp;
+static void (*bpf_dump_stack)(void) =
+   (void *) BPF_FUNC_dump_stack;
+static unsigned long long (*bpf_ktime_get_ns)(void) =
+   (void *) BPF_FUNC_ktime_get_ns;
 
 /* llvm builtin functions that eBPF C program may use to
  * emit BPF_LD_ABS and BPF_LD_IND instructions
diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c
index 1831d236382b..788ac51c1024 100644
--- a/samples/bpf/bpf_load.c
+++ b/samples/bpf/bpf_load.c
@@ -14,6 +14,8 @@
 #include "bpf_helpers.h"
 #include "bpf_load.h"
 
+#define DEBUGFS "/sys/kernel/debug/tracing/"
+
 static char license[128];
 static bool processed_sec[128];
 int map_fd[MAX_MAPS];
@@ -22,15 +24,18 @@ int prog_cnt;
 
 static int load_and_attach(const char *event, struct bpf_insn *prog, int size)
 {
-   int fd;
bool is_socket = strncmp(event, "socket", 6) == 0;
+   enum bpf_prog_type prog_type;
+   char path[256] = DEBUGFS;
+   char fmt[32];
+   int fd, event_fd, err;
 
-   if (!is_socket)
-   /* tracing events tbd */
-   return -1;
+   if (is_socket)
+   prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
+   else
+   prog_type = BPF_PROG_TYPE_TRACING_FILTER;
 
-   fd = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER,
-  prog, size, license);
+   fd = bpf_prog_load(prog_type, prog, size, license);
 
if (fd < 0) {
printf("bpf_prog_load() err=%d\n%s", errno, bpf_log_buf);
@@ -39,6 +44,28 @@ static int load_and_attach(const char *event, struct 
bpf_insn *prog, int size)
 
prog_fd[prog_cnt++] = fd;
 
+   if 

[PATCH tip 1/9] tracing: attach eBPF programs to tracepoints and syscalls

2015-01-15 Thread Alexei Starovoitov
User interface:
fd = open("/sys/kernel/debug/tracing/__event__/filter")

write(fd, "bpf_123")

where 123 is process local FD associated with eBPF program previously loaded.
__event__ is static tracepoint event or syscall.
(kprobe support is in next patch)
Once program is successfully attached to tracepoint event, the tracepoint
will be auto-enabled

close(fd)
auto-disables tracepoint event and detaches eBPF program from it

eBPF programs can call in-kernel helper functions to:
- lookup/update/delete elements in maps
- memcmp
- dump_stack
- fetch_ptr/u64/u32/u16/u8 values from unsafe address via probe_kernel_read(),
  so that eBPF program can walk any kernel data structures
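
A userspace sketch of the attach sequence described above, assuming only the interface proposed in this patch (write "bpf_<fd>" to an event's filter file, keep the file open while attached). The function names are illustrative and the filter path is left to the caller.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Formats the "bpf_<fd>" command; returns its length or -1 if it won't fit. */
static int bpf_filter_cmd(char *buf, size_t len, int prog_fd)
{
	int n;

	assert(buf != NULL && prog_fd >= 0);
	n = snprintf(buf, len, "bpf_%d", prog_fd);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Opens the filter file and writes the command. Returns the open fd on
 * success (closing it later detaches the program), or -1 on error. */
static int attach_sketch(const char *filter_path, int prog_fd)
{
	char cmd[32];
	int n = bpf_filter_cmd(cmd, sizeof(cmd), prog_fd);
	int fd;

	if (n < 0)
		return -1;
	fd = open(filter_path, O_WRONLY);
	if (fd < 0)
		return -1;
	if (write(fd, cmd, (size_t)n) != n) {
		close(fd);
		return -1;
	}
	return fd;
}
```

The returned descriptor has to stay open for as long as the program should remain attached, since close() is what disables the event again.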

Signed-off-by: Alexei Starovoitov 
---
 include/linux/ftrace_event.h   |4 ++
 include/trace/bpf_trace.h  |   25 +++
 include/trace/ftrace.h |   30 
 include/uapi/linux/bpf.h   |8 +++
 kernel/trace/Kconfig   |1 +
 kernel/trace/Makefile  |1 +
 kernel/trace/bpf_trace.c   |  140 
 kernel/trace/trace.h   |3 +
 kernel/trace/trace_events.c|   33 -
 kernel/trace/trace_events_filter.c |   76 +++-
 kernel/trace/trace_syscalls.c  |   31 
 11 files changed, 350 insertions(+), 2 deletions(-)
 create mode 100644 include/trace/bpf_trace.h
 create mode 100644 kernel/trace/bpf_trace.c

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index d36f68b08acc..a3897f5e43ca 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -248,6 +248,7 @@ enum {
TRACE_EVENT_FL_WAS_ENABLED_BIT,
TRACE_EVENT_FL_USE_CALL_FILTER_BIT,
TRACE_EVENT_FL_TRACEPOINT_BIT,
+   TRACE_EVENT_FL_BPF_BIT,
 };
 
 /*
@@ -270,6 +271,7 @@ enum {
TRACE_EVENT_FL_WAS_ENABLED  = (1 << TRACE_EVENT_FL_WAS_ENABLED_BIT),
TRACE_EVENT_FL_USE_CALL_FILTER  = (1 << 
TRACE_EVENT_FL_USE_CALL_FILTER_BIT),
TRACE_EVENT_FL_TRACEPOINT   = (1 << TRACE_EVENT_FL_TRACEPOINT_BIT),
+   TRACE_EVENT_FL_BPF  = (1 << TRACE_EVENT_FL_BPF_BIT),
 };
 
 struct ftrace_event_call {
@@ -544,6 +546,8 @@ event_trigger_unlock_commit_regs(struct ftrace_event_file 
*file,
event_triggers_post_call(file, tt);
 }
 
+unsigned int trace_filter_call_bpf(struct event_filter *filter, void *ctx);
+
 enum {
FILTER_OTHER = 0,
FILTER_STATIC_STRING,
diff --git a/include/trace/bpf_trace.h b/include/trace/bpf_trace.h
new file mode 100644
index ..4e64f61f484d
--- /dev/null
+++ b/include/trace/bpf_trace.h
@@ -0,0 +1,25 @@
+/* Copyright (c) 2011-2015 PLUMgrid, http://plumgrid.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#ifndef _LINUX_KERNEL_BPF_TRACE_H
+#define _LINUX_KERNEL_BPF_TRACE_H
+
+/* For tracepoint filters argN fields match one to one to arguments
+ * passed to tracepoint events
+ *
+ * For syscall entry filters argN fields match syscall arguments
+ * For syscall exit filters arg1 is a return value
+ */
+struct bpf_context {
+   u64 arg1;
+   u64 arg2;
+   u64 arg3;
+   u64 arg4;
+   u64 arg5;
+   u64 arg6;
+};
+
+#endif /* _LINUX_KERNEL_BPF_TRACE_H */
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 27609dfcce25..7b2cf74a9b08 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -17,6 +17,7 @@
  */
 
 #include 
+#include 
 
 /*
  * DECLARE_EVENT_CLASS can be used to add a generic function
@@ -617,6 +618,24 @@ static inline notrace int ftrace_get_offsets_##call(   
\
 #undef __perf_task
 #define __perf_task(t) (t)
 
+/* zero extend integer, pointer or aggregate type to u64 without warnings */
+#define __CAST_TO_U64(expr) ({ \
+   u64 ret = 0; \
+   switch (sizeof(expr)) { \
+   case 8: ret = *(u64 *)  break; \
+   case 4: ret = *(u32 *)  break; \
+   case 2: ret = *(u16 *)  break; \
+   case 1: ret = *(u8 *)  break; \
+   } \
+   ret; })
+
+#define __BPF_CAST1(a,...) __CAST_TO_U64(a)
+#define __BPF_CAST2(a,...) __CAST_TO_U64(a), __BPF_CAST1(__VA_ARGS__)
+#define __BPF_CAST3(a,...) __CAST_TO_U64(a), __BPF_CAST2(__VA_ARGS__)
+#define __BPF_CAST4(a,...) __CAST_TO_U64(a), __BPF_CAST3(__VA_ARGS__)
+#define __BPF_CAST5(a,...) __CAST_TO_U64(a), __BPF_CAST4(__VA_ARGS__)
+#define __BPF_CAST6(a,...) __CAST_TO_U64(a), __BPF_CAST5(__VA_ARGS__)
+
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
\
@@ -632,6 +651,17 @@ ftrace_raw_event_##call(void *__data, proto)   
\
if (ftrace_trigger_soft_disabled(ftrace_file))  \
return;   

[PATCH tip 6/9] samples: bpf: counting example for kfree_skb tracepoint and write syscall

2015-01-15 Thread Alexei Starovoitov
This example has two probes in one C file that attach to different tracepoints
and use two different maps.

The 1st probe is similar to dropmon.c. It attaches to the kfree_skb tracepoint
and counts the number of packet drops at different locations.

The 2nd probe attaches to syscalls/sys_enter_write and computes a histogram of
different write sizes.

Usage:
$ sudo tracex2
writing bpf-8 -> /sys/kernel/debug/tracing/events/skb/kfree_skb/filter
writing bpf-10 -> 
/sys/kernel/debug/tracing/events/syscalls/sys_enter_write/filter
location 0x816959a5 count 1

location 0x816959a5 count 2

557145+0 records in
557145+0 records out
285258240 bytes (285 MB) copied, 1.02379 s, 279 MB/s
   syscall write() stats
 byte_size     : count     distribution
    1 -> 1     : 3        |  |
    2 -> 3     : 0        |  |
    4 -> 7     : 0        |  |
    8 -> 15    : 0        |  |
   16 -> 31    : 2        |  |
   32 -> 63    : 3        |  |
   64 -> 127   : 1        |  |
  128 -> 255   : 1        |  |
  256 -> 511   : 0        |  |
  512 -> 1023  : 1118968  |* |

Ctrl-C at any time. Kernel will auto cleanup maps and programs
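
The histogram buckets are powers of two, so the bucket index is floor(log2(size)). A userspace sketch of that calculation, using the same branch-free shifts as the sample's log2() helper (the function names and the fixed-width types are choices made for this sketch):

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(v)) for a 32-bit value; returns 0 for v == 0. */
static unsigned int log2_u32(uint32_t v)
{
	unsigned int r, shift;

	r = (v > 0xFFFF) << 4; v >>= r;
	shift = (v > 0xFF) << 3; v >>= shift; r |= shift;
	shift = (v > 0xF) << 2; v >>= shift; r |= shift;
	shift = (v > 0x3) << 1; v >>= shift; r |= shift;
	r |= (v >> 1);
	return r;
}

/* 64-bit variant: handle the upper half first, then the lower half. */
static unsigned int log2_u64(uint64_t v)
{
	uint32_t hi = (uint32_t)(v >> 32);

	if (hi)
		return log2_u32(hi) + 32;
	return log2_u32((uint32_t)v);
}

/* Smallest size that falls into histogram bucket 'index'. */
static uint64_t bucket_low(unsigned int index)
{
	assert(index < 64);
	return (uint64_t)1 << index;
}
```

Writes of 512 to 1023 bytes all map to index 9, which is the "512 -> 1023" row in the output above.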

Signed-off-by: Alexei Starovoitov 
---
 samples/bpf/Makefile   |4 ++
 samples/bpf/tracex2_kern.c |   71 +
 samples/bpf/tracex2_user.c |   95 
 3 files changed, 170 insertions(+)
 create mode 100644 samples/bpf/tracex2_kern.c
 create mode 100644 samples/bpf/tracex2_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index da28e1b6d3a6..416af24b01fd 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -8,6 +8,7 @@ hostprogs-y += sockex1
 hostprogs-y += sockex2
 hostprogs-y += dropmon
 hostprogs-y += tracex1
+hostprogs-y += tracex2
 
 dropmon-objs := dropmon.o libbpf.o
 test_verifier-objs := test_verifier.o libbpf.o
@@ -16,12 +17,14 @@ sock_example-objs := sock_example.o libbpf.o
 sockex1-objs := bpf_load.o libbpf.o sockex1_user.o
 sockex2-objs := bpf_load.o libbpf.o sockex2_user.o
 tracex1-objs := bpf_load.o libbpf.o tracex1_user.o
+tracex2-objs := bpf_load.o libbpf.o tracex2_user.o
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
 always += sockex1_kern.o
 always += sockex2_kern.o
 always += tracex1_kern.o
+always += tracex2_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 
@@ -29,6 +32,7 @@ HOSTCFLAGS_bpf_load.o += -I$(objtree)/usr/include -Wno-unused-variable
 HOSTLOADLIBES_sockex1 += -lelf
 HOSTLOADLIBES_sockex2 += -lelf
 HOSTLOADLIBES_tracex1 += -lelf
+HOSTLOADLIBES_tracex2 += -lelf
 
 # point this to your LLVM backend with bpf support
 LLC=$(srctree)/tools/bpf/llvm/bld/Debug+Asserts/bin/llc
diff --git a/samples/bpf/tracex2_kern.c b/samples/bpf/tracex2_kern.c
new file mode 100644
index ..a789c456c1b4
--- /dev/null
+++ b/samples/bpf/tracex2_kern.c
@@ -0,0 +1,71 @@
+#include 
+#include 
+#include 
+#include 
+#include "bpf_helpers.h"
+
+struct bpf_map_def SEC("maps") my_map = {
+   .type = BPF_MAP_TYPE_HASH,
+   .key_size = sizeof(long),
+   .value_size = sizeof(long),
+   .max_entries = 1024,
+};
+
+SEC("events/skb/kfree_skb")
+int bpf_prog2(struct bpf_context *ctx)
+{
+   long loc = ctx->arg2;
+   long init_val = 1;
+   long *value;
+
+   value = bpf_map_lookup_elem(&my_map, &loc);
+   if (value)
+   *value += 1;
+   else
+   bpf_map_update_elem(&my_map, &loc, &init_val, BPF_ANY);
+   return 0;
+}
+
+static unsigned int log2(unsigned int v)
+{
+   unsigned int r;
+   unsigned int shift;
+
+   r = (v > 0xFFFF) << 4; v >>= r;
+   shift = (v > 0xFF) << 3; v >>= shift; r |= shift;
+   shift = (v > 0xF) << 2; v >>= shift; r |= shift;
+   shift = (v > 0x3) << 1; v >>= shift; r |= shift;
+   r |= (v >> 1);
+   return r;
+}
+
+static unsigned int log2l(unsigned long v)
+{
+   unsigned int hi = v >> 32;
+   if (hi)
+   return log2(hi) + 32;
+   else
+   return log2(v);
+}
+
+struct bpf_map_def SEC("maps") my_hist_map = {
+   .type = BPF_MAP_TYPE_ARRAY,
+   .key_size = sizeof(u32),
+   .value_size = sizeof(long),
+   .max_entries = 64,
+};
+
+SEC("events/syscalls/sys_enter_write")
+int bpf_prog3(struct bpf_context *ctx)
+{
+   long write_size = ctx->arg3;
+   long init_val = 1;
+   long *value;
+   u32 index = log2l(write_size);
+
+   value = bpf_map_lookup_elem(&my_hist_map, &index);
+   if (value)
+   __sync_fetch_and_add(value, 1);
+   return 0;
+}
+char _license[] 

[PATCH tip 4/9] samples: bpf: simple tracing example in eBPF assembler

2015-01-15 Thread Alexei Starovoitov
simple packet drop monitor:
- in-kernel eBPF program attaches to kfree_skb() event and records number
  of packet drops at given location
- userspace iterates over the map every second and prints stats

Usage:
$ sudo dropmon
location 0x81695995 count 1
location 0x816d0da9 count 2

location 0x81695995 count 2
location 0x816d0da9 count 2

location 0x81695995 count 3
location 0x816d0da9 count 2

$ addr2line -ape ./bld_x64/vmlinux 0x81695995 0x816d0da9
0x81695995: ./bld_x64/../net/ipv4/icmp.c:1038
0x816d0da9: ./bld_x64/../net/unix/af_unix.c:1231

Signed-off-by: Alexei Starovoitov 
---
 samples/bpf/Makefile  |2 +
 samples/bpf/dropmon.c |  129 +
 2 files changed, 131 insertions(+)
 create mode 100644 samples/bpf/dropmon.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index b5b3600dcdf5..789691374562 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -6,7 +6,9 @@ hostprogs-y := test_verifier test_maps
 hostprogs-y += sock_example
 hostprogs-y += sockex1
 hostprogs-y += sockex2
+hostprogs-y += dropmon
 
+dropmon-objs := dropmon.o libbpf.o
 test_verifier-objs := test_verifier.o libbpf.o
 test_maps-objs := test_maps.o libbpf.o
 sock_example-objs := sock_example.o libbpf.o
diff --git a/samples/bpf/dropmon.c b/samples/bpf/dropmon.c
new file mode 100644
index ..9a2cd3344d69
--- /dev/null
+++ b/samples/bpf/dropmon.c
@@ -0,0 +1,129 @@
+/* simple packet drop monitor:
+ * - in-kernel eBPF program attaches to kfree_skb() event and records number
+ *   of packet drops at given location
+ * - userspace iterates over the map every second and prints stats
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "libbpf.h"
+
+#define TRACEPOINT "/sys/kernel/debug/tracing/events/skb/kfree_skb/"
+
+static int write_to_file(const char *file, const char *str, bool keep_open)
+{
+   int fd, err;
+
+   fd = open(file, O_WRONLY);
+   err = write(fd, str, strlen(str));
+   (void) err;
+
+   if (keep_open) {
+   return fd;
+   } else {
+   close(fd);
+   return -1;
+   }
+}
+
+static int dropmon(void)
+{
+   long long key, next_key, value = 0;
+   int prog_fd, map_fd, i;
+   char fmt[32];
+
+   map_fd = bpf_create_map(BPF_MAP_TYPE_HASH, sizeof(key), sizeof(value), 1024);
+   if (map_fd < 0) {
+   printf("failed to create map '%s'\n", strerror(errno));
+   goto cleanup;
+   }
+
+   /* the following eBPF program is equivalent to C:
+* int filter(struct bpf_context *ctx)
+* {
+*   long loc = ctx->arg2;
+*   long init_val = 1;
+*   long *value;
+*
+*   value = bpf_map_lookup_elem(MAP_ID, &loc);
+*   if (value) {
+*  __sync_fetch_and_add(value, 1);
+*   } else {
+*  bpf_map_update_elem(MAP_ID, &loc, &init_val, BPF_ANY);
+*   }
+*   return 0;
+* }
+*/
+   struct bpf_insn prog[] = {
+   BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 8), /* r2 = *(u64 *)(r1 + 8) */
+   BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_2, -8), /* *(u64 *)(fp - 8) = r2 */
+   BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+   BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), /* r2 = fp - 8 */
+   BPF_LD_MAP_FD(BPF_REG_1, map_fd),
+   BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+   BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4),
+   BPF_MOV64_IMM(BPF_REG_1, 1), /* r1 = 1 */
+   BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_DW, BPF_REG_0, BPF_REG_1, 0, 0), /* xadd r0 += r1 */
+   BPF_MOV64_IMM(BPF_REG_0, 0), /* r0 = 0 */
+   BPF_EXIT_INSN(),
+   BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 1), /* *(u64 *)(fp - 16) = 1 */
+   BPF_MOV64_IMM(BPF_REG_4, BPF_ANY),
+   BPF_MOV64_REG(BPF_REG_3, BPF_REG_10),
+   BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, -16), /* r3 = fp - 16 */
+   BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+   BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), /* r2 = fp - 8 */
+   BPF_LD_MAP_FD(BPF_REG_1, map_fd),
+   BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_update_elem),
+   BPF_MOV64_IMM(BPF_REG_0, 0), /* r0 = 0 */
+   BPF_EXIT_INSN(),
+   };
+
+   prog_fd = bpf_prog_load(BPF_PROG_TYPE_TRACING_FILTER, prog,
+   sizeof(prog), "GPL");
+   if (prog_fd < 0) {
+   printf("failed to load prog '%s'\n%s", strerror(errno), bpf_log_buf);
+   return -1;
+   }
+
+   sprintf(fmt, "bpf_%d", prog_fd);
+
+   write_to_file(TRACEPOINT "filter", fmt, true);
+
+   for (i = 0; i < 10; i++) {
+ 

[PATCH tip 7/9] samples: bpf: IO latency analysis (iosnoop/heatmap)

2015-01-15 Thread Alexei Starovoitov
eBPF C program attaches to block_rq_issue/block_rq_complete events to calculate
IO latency. Then it waits for the first 100 events to compute average latency
and uses range [0 .. ave_lat * 2] to record histogram of events in this latency
range.
User space reads this histogram map every 2 seconds and prints it as a 'heatmap'
using gray shades of text terminal. Black spaces have many events and white
spaces have very few events. Left most space is the smallest latency, right most
space is the largest latency in the range.
If kernel sees too many events that fall out of histogram range, user space
adjusts the range up, so heatmap for next 2 seconds will be more accurate.
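
To make the range handling concrete, here is a small user-space sketch of how
a latency sample could be mapped to a heatmap slot. It is only an illustration:
the index computation in tracex3_kern.c is cut off in this copy of the patch,
so the linear scaling and the clamp below are assumptions, and latency_slot()
is a made-up name. Only MAX_SLOT and the [0 .. ave_lat * 2] range come from the
patch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SLOT 32	/* same value as in tracex3_kern.c */

/*
 * Map a latency sample to a slot in [0, MAX_SLOT).
 * Returns -1 when the sample falls outside [0 .. ave_lat * 2];
 * the patch counts such samples as "missed".
 */
static int latency_slot(uint64_t delta_usec, uint64_t ave_lat_usec)
{
	uint64_t max_lat = ave_lat_usec * 2;
	uint64_t ind;

	if (max_lat == 0 || delta_usec > max_lat)
		return -1;

	/* assumed linear scaling of the range onto the slots */
	ind = delta_usec * MAX_SLOT / max_lat;
	if (ind >= MAX_SLOT)
		ind = MAX_SLOT - 1;	/* delta == max_lat goes to the last slot */

	assert(ind < MAX_SLOT);
	return (int)ind;
}
```

With an average latency of 100 usec, a 100 usec sample lands in slot 16 (the
middle of the heatmap) and anything above 200 usec is reported as out of range.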

Usage:
$ sudo ./tracex3
and do 'sudo dd if=/dev/sda of=/dev/null' in other terminal.
Observe IO latencies and how different activity (like 'make kernel') affects it.

Similar experiments can be done for network transmit latencies, syscalls, etc

Signed-off-by: Alexei Starovoitov 
---
 samples/bpf/Makefile   |4 ++
 samples/bpf/tracex3_kern.c |   96 +
 samples/bpf/tracex3_user.c |  146 
 3 files changed, 246 insertions(+)
 create mode 100644 samples/bpf/tracex3_kern.c
 create mode 100644 samples/bpf/tracex3_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 416af24b01fd..da0efd8032ab 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -9,6 +9,7 @@ hostprogs-y += sockex2
 hostprogs-y += dropmon
 hostprogs-y += tracex1
 hostprogs-y += tracex2
+hostprogs-y += tracex3
 
 dropmon-objs := dropmon.o libbpf.o
 test_verifier-objs := test_verifier.o libbpf.o
@@ -18,6 +19,7 @@ sockex1-objs := bpf_load.o libbpf.o sockex1_user.o
 sockex2-objs := bpf_load.o libbpf.o sockex2_user.o
 tracex1-objs := bpf_load.o libbpf.o tracex1_user.o
 tracex2-objs := bpf_load.o libbpf.o tracex2_user.o
+tracex3-objs := bpf_load.o libbpf.o tracex3_user.o
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -25,6 +27,7 @@ always += sockex1_kern.o
 always += sockex2_kern.o
 always += tracex1_kern.o
 always += tracex2_kern.o
+always += tracex3_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 
@@ -33,6 +36,7 @@ HOSTLOADLIBES_sockex1 += -lelf
 HOSTLOADLIBES_sockex2 += -lelf
 HOSTLOADLIBES_tracex1 += -lelf
 HOSTLOADLIBES_tracex2 += -lelf
+HOSTLOADLIBES_tracex3 += -lelf
 
 # point this to your LLVM backend with bpf support
 LLC=$(srctree)/tools/bpf/llvm/bld/Debug+Asserts/bin/llc
diff --git a/samples/bpf/tracex3_kern.c b/samples/bpf/tracex3_kern.c
new file mode 100644
index ..fa04603b80b8
--- /dev/null
+++ b/samples/bpf/tracex3_kern.c
@@ -0,0 +1,96 @@
+#include 
+#include 
+#include 
+#include 
+#include "bpf_helpers.h"
+
+struct bpf_map_def SEC("maps") my_map = {
+   .type = BPF_MAP_TYPE_HASH,
+   .key_size = sizeof(long),
+   .value_size = sizeof(u64),
+   .max_entries = 4096,
+};
+
+SEC("events/block/block_rq_issue")
+int bpf_prog1(struct bpf_context *ctx)
+{
+   long rq = ctx->arg2;
+   u64 val = bpf_ktime_get_ns();
+
+   bpf_map_update_elem(&my_map, &rq, &val, BPF_ANY);
+   return 0;
+}
+
+struct globals {
+   u64 lat_ave;
+   u64 lat_sum;
+   u64 missed;
+   u64 max_lat;
+   int num_samples;
+};
+
+struct bpf_map_def SEC("maps") global_map = {
+   .type = BPF_MAP_TYPE_ARRAY,
+   .key_size = sizeof(int),
+   .value_size = sizeof(struct globals),
+   .max_entries = 1,
+};
+
+#define MAX_SLOT 32
+
+struct bpf_map_def SEC("maps") lat_map = {
+   .type = BPF_MAP_TYPE_ARRAY,
+   .key_size = sizeof(int),
+   .value_size = sizeof(u64),
+   .max_entries = MAX_SLOT,
+};
+
+SEC("events/block/block_rq_complete")
+int bpf_prog2(struct bpf_context *ctx)
+{
+   long rq = ctx->arg2;
+   void *value;
+
+   value = bpf_map_lookup_elem(&my_map, &rq);
+   if (!value)
+   return 0;
+
+   u64 cur_time = bpf_ktime_get_ns();
+   u64 delta = (cur_time - *(u64 *)value) / 1000;
+
+   bpf_map_delete_elem(&my_map, &rq);
+
+   int ind = 0;
+   struct globals *g = bpf_map_lookup_elem(&global_map, &ind);
+   if (!g)
+   return 0;
+   if (g->lat_ave == 0) {
+   g->num_samples++;
+   g->lat_sum += delta;
+   if (g->num_samples >= 100) {
+   g->lat_ave = g->lat_sum / g->num_samples;
+   if (0/* debug */) {
+   char fmt[] = "after %d samples average latency %ld usec\n";
+   bpf_printk(fmt, sizeof(fmt), g->num_samples,
+  g->lat_ave);
+   }
+   }
+   } else {
+   u64 max_lat = g->lat_ave * 2;
+   if (delta > max_lat) {
+   g->missed++;
+   if (delta > g->max_lat)
+   g->max_lat = delta;
+   return 0;
+   }
+
+   ind 

[PATCH tip 9/9] samples: bpf: simple kprobe example

2015-01-15 Thread Alexei Starovoitov
The logic of the example is similar to tracex2, but syscall 'write' statistics
are captured from a kprobe placed at the sys_write function instead of through
syscall instrumentation.
Also, tracex4_kern.c has a different way of doing log2 in C.
Note that, unlike tracepoint and syscall programs, kprobe programs receive
'struct pt_regs' as input. It is the responsibility of the program author
or a higher-level dynamic tracing tool to match registers to function arguments.
Since pt_regs is architecture dependent, kprobe programs are also arch dependent,
unlike tracepoint/syscall programs, which are universal.
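
As an illustration of the register matching mentioned above, the sketch below
models how the first six integer arguments of a C function are found on x86-64
(rdi, rsi, rdx, rcx, r8, r9). It is not part of the patch: struct arg_regs is a
hypothetical stand-in that carries only the argument registers of
'struct pt_regs', and func_arg() is a made-up helper.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the argument registers of x86-64 'struct pt_regs' */
struct arg_regs {
	uint64_t di, si, dx, cx, r8, r9;
};

/* Return the n-th (1-based) integer argument of a C function call on x86-64. */
static uint64_t func_arg(const struct arg_regs *regs, int n)
{
	assert(n >= 1 && n <= 6);

	switch (n) {
	case 1: return regs->di;
	case 2: return regs->si;
	case 3: return regs->dx;
	case 4: return regs->cx;
	case 5: return regs->r8;
	default: return regs->r9;
	}
}
```

For sys_write(fd, buf, count), func_arg(regs, 3) corresponds to the regs->dx
read done by tracex4_kern.c. Other architectures need a different mapping.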

Usage:
$ sudo tracex4
writing bpf-6 -> /sys/kernel/debug/tracing/events/kprobes/sys_write/filter
2216443+0 records in
2216442+0 records out
1134818304 bytes (1.1 GB) copied, 2.00746 s, 565 MB/s

   kprobe sys_write() stats
 byte_size   : count    distribution
   1 -> 1    : 0        |  |
   2 -> 3    : 0        |  |
   4 -> 7    : 0        |  |
   8 -> 15   : 0        |  |
  16 -> 31   : 0        |  |
  32 -> 63   : 0        |  |
  64 -> 127  : 1        |  |
 128 -> 255  : 0        |  |
 256 -> 511  : 0        |  |
 512 -> 1023 : 2214734  |* |

Signed-off-by: Alexei Starovoitov 
---
 samples/bpf/Makefile   |4 +++
 samples/bpf/bpf_load.c |3 ++
 samples/bpf/tracex4_kern.c |   36 +++
 samples/bpf/tracex4_user.c |   83 
 4 files changed, 126 insertions(+)
 create mode 100644 samples/bpf/tracex4_kern.c
 create mode 100644 samples/bpf/tracex4_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index da0efd8032ab..22c7a38f3f95 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -10,6 +10,7 @@ hostprogs-y += dropmon
 hostprogs-y += tracex1
 hostprogs-y += tracex2
 hostprogs-y += tracex3
+hostprogs-y += tracex4
 
 dropmon-objs := dropmon.o libbpf.o
 test_verifier-objs := test_verifier.o libbpf.o
@@ -20,6 +21,7 @@ sockex2-objs := bpf_load.o libbpf.o sockex2_user.o
 tracex1-objs := bpf_load.o libbpf.o tracex1_user.o
 tracex2-objs := bpf_load.o libbpf.o tracex2_user.o
 tracex3-objs := bpf_load.o libbpf.o tracex3_user.o
+tracex4-objs := bpf_load.o libbpf.o tracex4_user.o
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -28,6 +30,7 @@ always += sockex2_kern.o
 always += tracex1_kern.o
 always += tracex2_kern.o
 always += tracex3_kern.o
+always += tracex4_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 
@@ -37,6 +40,7 @@ HOSTLOADLIBES_sockex2 += -lelf
 HOSTLOADLIBES_tracex1 += -lelf
 HOSTLOADLIBES_tracex2 += -lelf
 HOSTLOADLIBES_tracex3 += -lelf
+HOSTLOADLIBES_tracex4 += -lelf
 
 # point this to your LLVM backend with bpf support
 LLC=$(srctree)/tools/bpf/llvm/bld/Debug+Asserts/bin/llc
diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c
index 788ac51c1024..d8c5176f0564 100644
--- a/samples/bpf/bpf_load.c
+++ b/samples/bpf/bpf_load.c
@@ -25,6 +25,7 @@ int prog_cnt;
 static int load_and_attach(const char *event, struct bpf_insn *prog, int size)
 {
bool is_socket = strncmp(event, "socket", 6) == 0;
+   bool is_kprobe = strncmp(event, "events/kprobes/", 15) == 0;
enum bpf_prog_type prog_type;
char path[256] = DEBUGFS;
char fmt[32];
@@ -32,6 +33,8 @@ static int load_and_attach(const char *event, struct bpf_insn *prog, int size)
 
if (is_socket)
prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
+   else if (is_kprobe)
+   prog_type = BPF_PROG_TYPE_KPROBE_FILTER;
else
prog_type = BPF_PROG_TYPE_TRACING_FILTER;
 
diff --git a/samples/bpf/tracex4_kern.c b/samples/bpf/tracex4_kern.c
new file mode 100644
index ..9646f9e43417
--- /dev/null
+++ b/samples/bpf/tracex4_kern.c
@@ -0,0 +1,36 @@
+#include 
+#include 
+#include 
+#include 
+#include "bpf_helpers.h"
+
+static unsigned int log2l(unsigned long long n)
+{
+#define S(k) if (n >= (1ull << k)) { i += k; n >>= k; }
+   int i = -(n == 0);
+   S(32); S(16); S(8); S(4); S(2); S(1);
+   return i;
+#undef S
+}
+
+struct bpf_map_def SEC("maps") my_hist_map = {
+   .type = BPF_MAP_TYPE_ARRAY,
+   .key_size = sizeof(u32),
+   .value_size = sizeof(long),
+   .max_entries = 64,
+};
+
+SEC("events/kprobes/sys_write")
+int bpf_prog4(struct pt_regs *regs)
+{
+   long write_size = regs->dx; /* $rdx contains 3rd argument to a function */
+   long init_val = 1;
+   void *value;
+   u32 index = log2l(write_size);
+
+   value = bpf_map_lookup_elem(&my_hist_map, &index);
+   if (value)
+ 

[PATCH tip 3/9] tracing: allow eBPF programs to call ktime_get_ns()

2015-01-15 Thread Alexei Starovoitov
bpf_ktime_get_ns() is used by programs to compute the time delta between events
or as a timestamp.

Signed-off-by: Alexei Starovoitov 
---
 include/uapi/linux/bpf.h |1 +
 kernel/trace/bpf_trace.c |   10 ++
 2 files changed, 11 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index ef88e3f45b85..6075c4f4b67e 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -171,6 +171,7 @@ enum bpf_func_id {
BPF_FUNC_memcmp,  /* int bpf_memcmp(void *unsafe_ptr, void *safe_ptr, int size) */
BPF_FUNC_dump_stack,  /* void bpf_dump_stack(void) */
BPF_FUNC_printk,  /* int bpf_printk(const char *fmt, int fmt_size, ...) */
+   BPF_FUNC_ktime_get_ns,/* u64 bpf_ktime_get_ns(void) */
__BPF_FUNC_MAX_ID,
 };
 
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 3825d7a3cbd1..14cfbbcec32e 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -114,6 +114,11 @@ static u64 bpf_printk(u64 r1, u64 fmt_size, u64 r3, u64 r4, u64 r5)
  mod_l[2] ? r5 : (u32) r5);
 }
 
+static u64 bpf_ktime_get_ns(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
+{
+   return ktime_get_ns();
+}
+
 static struct bpf_func_proto tracing_filter_funcs[] = {
 #define FETCH(SIZE)\
[BPF_FUNC_fetch_##SIZE] = { \
@@ -147,6 +152,11 @@ static struct bpf_func_proto tracing_filter_funcs[] = {
.arg1_type = ARG_PTR_TO_STACK,
.arg2_type = ARG_CONST_STACK_SIZE,
},
+   [BPF_FUNC_ktime_get_ns] = {
+   .func = bpf_ktime_get_ns,
+   .gpl_only = true,
+   .ret_type = RET_INTEGER,
+   },
 };
 
 static const struct bpf_func_proto *tracing_filter_func_proto(enum bpf_func_id func_id)
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH tip 8/9] tracing: attach eBPF programs to kprobe/kretprobe

2015-01-15 Thread Alexei Starovoitov
Introduce a new type of eBPF program, BPF_PROG_TYPE_KPROBE_FILTER.
Such programs are allowed to call the same helper functions
as tracing filters, but bpf_context is different:
For tracing filters, bpf_context is the 6 arguments of tracepoints or syscalls.
For kprobe filters, bpf_context == pt_regs.
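
The context access rule that kprobe_filter_is_valid_access() enforces in the
diff below (in-bounds, read-only, aligned) can be modelled in user space
roughly as follows. This is a sketch, not the kernel code: ctx_access_ok() and
enum access_type are made-up names, ctx_size stands in for
sizeof(struct pt_regs), and the size and end-of-struct checks are extra
assumptions that the patch itself does not contain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum access_type { ACCESS_READ, ACCESS_WRITE };	/* hypothetical names */

/* Decide whether a program may access 'size' bytes at offset 'off' of its context. */
static bool ctx_access_ok(size_t ctx_size, int off, int size,
			  enum access_type type)
{
	assert(ctx_size > 0);

	/* extra guard, not in the patch: avoids a modulo by zero below */
	if (size <= 0)
		return false;

	/* check bounds */
	if (off < 0 || (size_t)off >= ctx_size)
		return false;

	/* extra check, not in the patch: the access must end inside the struct */
	if ((size_t)off + (size_t)size > ctx_size)
		return false;

	/* only read is allowed */
	if (type != ACCESS_READ)
		return false;

	/* disallow misaligned access */
	if (off % size != 0)
		return false;

	return true;
}
```

For example, with a 168-byte context an 8-byte read at offset 96 is accepted,
while a write at the same offset or an 8-byte read at offset 4 is rejected.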

Signed-off-by: Alexei Starovoitov 
---
 include/linux/ftrace_event.h   |2 ++
 include/uapi/linux/bpf.h   |1 +
 kernel/trace/bpf_trace.c   |   39 
 kernel/trace/trace_events_filter.c |   10 ++---
 kernel/trace/trace_kprobe.c|   11 +-
 5 files changed, 59 insertions(+), 4 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index a3897f5e43ca..0f1a0418bef7 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -249,6 +249,7 @@ enum {
TRACE_EVENT_FL_USE_CALL_FILTER_BIT,
TRACE_EVENT_FL_TRACEPOINT_BIT,
TRACE_EVENT_FL_BPF_BIT,
+   TRACE_EVENT_FL_KPROBE_BIT,
 };
 
 /*
@@ -272,6 +273,7 @@ enum {
TRACE_EVENT_FL_USE_CALL_FILTER  = (1 << TRACE_EVENT_FL_USE_CALL_FILTER_BIT),
TRACE_EVENT_FL_TRACEPOINT   = (1 << TRACE_EVENT_FL_TRACEPOINT_BIT),
TRACE_EVENT_FL_BPF  = (1 << TRACE_EVENT_FL_BPF_BIT),
+   TRACE_EVENT_FL_KPROBE   = (1 << TRACE_EVENT_FL_KPROBE_BIT),
 };
 
 struct ftrace_event_call {
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 6075c4f4b67e..79ca0c63ffaf 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -119,6 +119,7 @@ enum bpf_prog_type {
BPF_PROG_TYPE_UNSPEC,
BPF_PROG_TYPE_SOCKET_FILTER,
BPF_PROG_TYPE_TRACING_FILTER,
+   BPF_PROG_TYPE_KPROBE_FILTER,
 };
 
 /* flags for BPF_MAP_UPDATE_ELEM command */
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 14cfbbcec32e..c485c7cc8d57 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -209,3 +209,42 @@ static int __init register_tracing_filter_ops(void)
return 0;
 }
 late_initcall(register_tracing_filter_ops);
+
+/* check access to fields of 'struct pt_regs' from BPF program */
+static bool kprobe_filter_is_valid_access(int off, int size, enum bpf_access_type type)
+{
+   /* check bounds */
+   if (off < 0 || off >= sizeof(struct pt_regs))
+   return false;
+
+   /* only read is allowed */
+   if (type != BPF_READ)
+   return false;
+
+   /* disallow misaligned access */
+   if (off % size != 0)
+   return false;
+
+   return true;
+}
+/* kprobe filter programs are allowed to call the same helper functions
+ * as tracing filters, but bpf_context is different:
+ * For tracing filters bpf_context is 6 arguments of tracepoints or syscalls
+ * For kprobe filters bpf_context == pt_regs
+ */
+static struct bpf_verifier_ops kprobe_filter_ops = {
+   .get_func_proto = tracing_filter_func_proto,
+   .is_valid_access = kprobe_filter_is_valid_access,
+};
+
+static struct bpf_prog_type_list kprobe_tl = {
+   .ops = &kprobe_filter_ops,
+   .type = BPF_PROG_TYPE_KPROBE_FILTER,
+};
+
+static int __init register_kprobe_filter_ops(void)
+{
+   bpf_register_prog_type(&kprobe_tl);
+   return 0;
+}
+late_initcall(register_kprobe_filter_ops);
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index bb0140414238..75b7e93b2d28 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -1891,7 +1891,8 @@ static int create_filter_start(char *filter_str, bool set_str,
return err;
 }
 
-static int create_filter_bpf(char *filter_str, struct event_filter **filterp)
+static int create_filter_bpf(struct ftrace_event_call *call, char *filter_str,
+struct event_filter **filterp)
 {
struct event_filter *filter;
struct bpf_prog *prog;
@@ -1920,7 +1921,10 @@ static int create_filter_bpf(char *filter_str, struct event_filter **filterp)
 
filter->prog = prog;
 
-   if (prog->aux->prog_type != BPF_PROG_TYPE_TRACING_FILTER) {
+   if (((call->flags & TRACE_EVENT_FL_KPROBE) &&
+prog->aux->prog_type != BPF_PROG_TYPE_KPROBE_FILTER) ||
+   (!(call->flags & TRACE_EVENT_FL_KPROBE) &&
+prog->aux->prog_type != BPF_PROG_TYPE_TRACING_FILTER)) {
/* valid fd, but invalid bpf program type */
err = -EINVAL;
goto free_filter;
@@ -2051,7 +2055,7 @@ int apply_event_filter(struct ftrace_event_file *file, char *filter_string)
 */
if (memcmp(filter_string, "bpf", 3) == 0 && filter_string[3] != 0 &&
filter_string[4] != 0) {
-   err = create_filter_bpf(filter_string, &filter);
+   err = create_filter_bpf(call, filter_string, &filter);
if (!err)
file->flags |= TRACE_EVENT_FL_BPF;
} else {
diff --git a/kernel/trace/trace_kprobe.c 

[PATCH tip 0/9] tracing: attach eBPF programs to tracepoints/syscalls/kprobe

2015-01-15 Thread Alexei Starovoitov
Hi Ingo, Steven,

This patch set is based on tip/master.
It adds the ability to attach eBPF programs to tracepoints, syscalls and kprobes.

Mechanism of attaching:
- load program via bpf() syscall and receive program_fd
- event_fd = open("/sys/kernel/debug/tracing/events/.../filter")
- write 'bpf-123' to event_fd where 123 is program_fd
- program will be attached to particular event and event automatically enabled
- close(event_fd) will detach bpf program from event and event disabled
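
From user space, the attach steps above could look roughly like the sketch
below. It assumes the interface exactly as proposed in this set (a "bpf-<fd>"
string written to the event's filter file); format_attach_cmd() and
attach_prog_to_event() are made-up helper names, and the filter path is left
to the caller.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build the attach command "bpf-<program_fd>"; returns its length or -1. */
static int format_attach_cmd(char *buf, size_t len, int prog_fd)
{
	int n;

	assert(buf != NULL && len > 0);

	n = snprintf(buf, len, "bpf-%d", prog_fd);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Attach an already loaded program to an event by writing the command
 * to the event's filter file. Returns the open event fd on success;
 * per the description above, closing that fd detaches the program.
 * Returns -1 on error.
 */
static int attach_prog_to_event(const char *filter_path, int prog_fd)
{
	char cmd[32];
	int event_fd, n;

	n = format_attach_cmd(cmd, sizeof(cmd), prog_fd);
	if (n < 0)
		return -1;

	event_fd = open(filter_path, O_WRONLY);
	if (event_fd < 0)
		return -1;

	if (write(event_fd, cmd, (size_t)n) != n) {
		close(event_fd);
		return -1;
	}
	return event_fd;
}
```

A caller would pass the fd returned by the bpf() syscall as prog_fd and keep
the returned event fd open for as long as the program should stay attached.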

Program attach point and input arguments:
- programs attached to kprobes receive 'struct pt_regs *' as an input.
  See tracex4_kern.c that demonstrates how users can write a C program like:
  SEC("events/kprobes/sys_write")
  int bpf_prog4(struct pt_regs *regs)
  {
 long write_size = regs->dx; 
     // here the user needs to know the prototype of sys_write() from the kernel
     // sources and the x64 calling convention to know that register $rdx
     // contains the 3rd argument to sys_write(), which is 'size_t count'

  it's obviously architecture dependent, but allows building sophisticated
  user tools on top, that can see from debug info of vmlinux which variables
  are in which registers or stack locations and fetch it from there.
  'perf probe' can potentially use this hook to generate programs in user space
  and insert them instead of letting kernel parse string during kprobe creation.

- programs attached to tracepoints and syscalls receive 'struct bpf_context *':
  u64 arg1, arg2, ..., arg6;
  for syscalls they match syscall arguments.
  for tracepoints these args match arguments passed to tracepoint.
  For example:
  trace_sched_migrate_task(p, new_cpu); from sched/core.c
  arg1 <- p        which is 'struct task_struct *'
  arg2 <- new_cpu  which is 'unsigned int'
  arg3..arg6 = 0
  the program can use bpf_fetch_u8/16/32/64/ptr() helpers to walk 'task_struct'
  or any other kernel data structures.
  These helpers are using probe_kernel_read() similar to 'perf probe' which is
  not 100% safe in both cases, but good enough.
  To access task_struct's pid inside 'sched_migrate_task' tracepoint
  the program can do:
  struct task_struct *task = (struct task_struct *)ctx->arg1;
  u32 pid = bpf_fetch_u32(&task->pid);
  Since struct layout is kernel configuration specific such programs are not
  portable and require access to kernel headers to be compiled,
  but in this case we don't need debug info.
  llvm with bpf backend will statically compute task->pid offset as a constant
  based on kernel headers only.
  The example of this arbitrary pointer walking is tracex1_kern.c
  which does skb->dev->name == "lo" filtering.

In all cases the programs are called before trace buffer is allocated to
minimize the overhead, since we want to filter huge number of events, but
buffer alloc/free and argument copy for every event is too costly.
Theoretically we can invoke programs after buffer is allocated, but it
doesn't seem needed, since above approach is faster and achieves the same.

Note, tracepoint/syscall and kprobe programs are two different types:
BPF_PROG_TYPE_TRACING_FILTER and BPF_PROG_TYPE_KPROBE_FILTER,
since they expect different input.
Both use the same set of helper functions:
- map access (lookup/update/delete)
- fetch (probe_kernel_read wrappers)
- memcmp (probe_kernel_read + memcmp)
- dump_stack
- trace_printk
The last two are mainly to debug the programs and to print data for user
space consumption.

Portability:
- kprobe programs are architecture dependent and need user scripting
  language like ktap/stap/dtrace/perf that will dynamically generate
  them based on debug info in vmlinux
- tracepoint programs are architecture independent, but if arbitrary pointer
  walking (with fetch() helpers) is used, they need data struct layout to match.
  Debug info is not necessary
- for networking use case we need to access 'struct sk_buff' fields in portable
  way (user space needs to fetch packet length without knowing skb->len offset),
  so for some frequently used data structures we will add helper functions
  or pseudo instructions to access them. I've hacked up a few ways specifically
  for skb, but abandoned them in favor of more generic type/field infra.
  That work is still wip. Not part of this set.
  Once it's ready tracepoint programs that access common data structs
  will be kernel independent.

Program return value:
- programs return 0 to discard an event
- and return non-zero to proceed with event (allocate trace buffer, copy
  arguments there and print it eventually in trace_pipe in traditional way)

Examples:
- dropmon.c - simple kfree_skb() accounting in eBPF assembler, similar
  to dropmon tool
- tracex1_kern.c - does net/netif_receive_skb event filtering
  for skb->dev->name == "lo" condition
- tracex2_kern.c - same kfree_skb() accounting like dropmon, but now in C
  plus computes histogram of all write sizes from sys_write syscall
  and prints the histogram in userspace
- tracex3_kern.c - most sophisticated example that computes IO 

Re: [PATCH v2 1/2] mm/slub: optimize alloc/free fastpath by removing preemption on/off

2015-01-15 Thread Steven Rostedt
On Thu, 15 Jan 2015 21:57:58 -0600 (CST)
Christoph Lameter  wrote:

> > I get:
> >
> > mov%gs:0x18(%rax),%rdx
> >
> > Looks to me that %gs is used.
> 
> %gs is used as a segment prefix. That does not add significant cycles.
> Retrieving the content of %gs and loading it into another register
> would be expensive in terms of cpu cycles.

OK, maybe that's what I saw in my previous benchmarks. Again, that was
a while ago.

-- Steve


Re: [PATCH v2 1/2] mm/slub: optimize alloc/free fastpath by removing preemption on/off

2015-01-15 Thread Steven Rostedt
On Thu, 15 Jan 2015 22:51:30 -0500
Steven Rostedt  wrote:

> 
> I haven't done benchmarks in a while, so perhaps accessing the %gs
> segment isn't as expensive as I saw it before. I'll have to profile
> function tracing on my i7 and see where things are slow again.

I just ran it on my i7, and yeah, the %gs access isn't much worse than
any of the other instructions. I had an old box that recently died that
I did my last benchmarks on, so that was probably why it made such a
difference.

-- Steve



Re: [PATCH v2 1/2] mm/slub: optimize alloc/free fastpath by removing preemption on/off

2015-01-15 Thread Christoph Lameter
> I get:
>
>   mov%gs:0x18(%rax),%rdx
>
> Looks to me that %gs is used.

%gs is used as a segment prefix. That does not add significant cycles.
Retrieving the content of %gs and loading it into another register would
be expensive in terms of cpu cycles.



[PATCH 13/28] PCI/x86: Refine pci_acpi_scan_root() with generic pci_host_bridge

2015-01-15 Thread Yijing Wang
Signed-off-by: Yijing Wang 
---
 arch/x86/pci/acpi.c |   36 
 1 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index 8edea63..f9a55c2 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -467,6 +467,18 @@ static void probe_pci_root_info(struct pci_root_info *info,
info);
 }
 
+static int pci_host_bridge_prepare(struct pci_host_bridge *bridge)
+{
+   struct pci_sysdata *sd = dev_get_drvdata(&bridge->dev);
+
+   ACPI_COMPANION_SET(&bridge->dev, sd->companion);
+   return 0;
+}
+
+static struct pci_host_bridge_ops phb_ops = {
+   .phb_prepare = pci_host_bridge_prepare,
+};
+
 struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root)
 {
struct acpi_device *device = root->device;
@@ -475,6 +487,7 @@ struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root)
int busnum = root->secondary.start;
LIST_HEAD(resources);
struct pci_bus *bus;
+   struct pci_host_bridge *host = NULL;
struct pci_sysdata *sd;
int node;
 
@@ -537,14 +550,13 @@ struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root)
 
if (!setup_mcfg_map(info, domain, (u8)root->secondary.start,
(u8)root->secondary.end, root->mcfg_addr))
-   bus = pci_create_root_bus(NULL, PCI_DOMBUS(domain, busnum),
-   &pci_root_ops, sd, &resources);
-
-   if (bus) {
-   pci_scan_child_bus(bus);
-   pci_set_host_bridge_release(
-   to_pci_host_bridge(bus->bridge),
-   release_pci_root_info, info);
+   host = pci_scan_root_bridge(NULL, PCI_DOMBUS(domain, busnum),
+   &pci_root_ops, sd, &resources, &phb_ops);
+
+   if (host) {
+   bus = host->bus;
+   pci_set_host_bridge_release(host, release_pci_root_info,
+   info);
} else {
pci_free_resource_list(&resources);
__release_pci_root_info(info);
@@ -566,14 +578,6 @@ struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root)
return bus;
 }
 
-int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
-{
-   struct pci_sysdata *sd = bridge->bus->sysdata;
-
-   ACPI_COMPANION_SET(&bridge->dev, sd->companion);
-   return 0;
-}
-
 int __init pci_acpi_init(void)
 {
struct pci_dev *dev = NULL;
-- 
1.7.1



[PATCH 14/28] PCI/IA64: Refine pci_acpi_scan_root() with generic pci_host_bridge

2015-01-15 Thread Yijing Wang
From: Yijing Wang 

Signed-off-by: Yijing Wang 
---
 arch/ia64/pci/pci.c |   34 ++
 1 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
index e457015..7736c02 100644
--- a/arch/ia64/pci/pci.c
+++ b/arch/ia64/pci/pci.c
@@ -420,6 +420,18 @@ probe_pci_root_info(struct pci_root_info *info, struct acpi_device *device,
return 0;
 }
 
+static int pci_host_bridge_prepare(struct pci_host_bridge *bridge)
+{
+   struct pci_controller *controller = dev_get_drvdata(&bridge->dev);
+
+   ACPI_COMPANION_SET(&bridge->dev, controller->companion);
+   return 0;
+}
+
+static struct pci_host_bridge_ops phb_ops = {
+   .phb_prepare = pci_host_bridge_prepare,
+};
+
 struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root)
 {
struct acpi_device *device = root->device;
@@ -428,7 +440,7 @@ struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root 
*root)
struct pci_controller *controller;
struct pci_root_info *info = NULL;
int busnum = root->secondary.start;
-   struct pci_bus *pbus;
+   struct pci_host_bridge *host;
int ret;
 
controller = alloc_pci_controller(domain);
@@ -465,26 +477,16 @@ struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root 
*root)
 * should handle the case here, but it appears that IA64 hasn't
 * such quirk. So we just ignore the case now.
 */
-   pbus = pci_create_root_bus(NULL, PCI_DOMBUS(domain, bus), 
-   _root_ops, controller, >resources);
-   if (!pbus) {
+   host = pci_scan_root_bridge(NULL, PCI_DOMBUS(domain, bus),
+   _root_ops, controller, >resources, _ops);
+   if (!host) {
pci_free_resource_list(&info->resources);
__release_pci_root_info(info);
return NULL;
}
 
-   pci_set_host_bridge_release(to_pci_host_bridge(pbus->bridge),
-   release_pci_root_info, info);
-   pci_scan_child_bus(pbus);
-   return pbus;
-}
-
-int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
-{
-   struct pci_controller *controller = bridge->bus->sysdata;
-
-   ACPI_COMPANION_SET(&bridge->dev, controller->companion);
-   return 0;
+   pci_set_host_bridge_release(host, release_pci_root_info, info);
+   return host->bus;
 }
 
 static int is_valid_resource(struct pci_dev *dev, int idx)
-- 
1.7.1



[PATCH 12/28] PCI: Introduce new scan function pci_scan_root_bridge()

2015-01-15 Thread Yijing Wang
Introduce a new scan function, pci_scan_root_bridge(), to
support host bridge drivers that need to provide their own
platform specific pci_host_bridge_ops.
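
As a rough illustration of the create-then-scan sequence described
above, here is a self-contained userspace sketch. It is not the kernel
code: struct bridge, bridge_create(), bridge_scan(), bridge_free() and
scan_root_bridge() are hypothetical stand-ins, and it assumes the scan
step does not free the bridge itself. The point is only that a failed
scan releases the bridge exactly once and the caller gets NULL.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct pci_host_bridge. */
struct bridge {
	int busnum;
	int scanned;
};

/* Number of bridges allocated and not yet freed (for the example only). */
static int live_bridges;

static struct bridge *bridge_create(int busnum)
{
	struct bridge *b = calloc(1, sizeof(*b));

	if (!b)
		return NULL;
	b->busnum = busnum;
	live_bridges++;
	return b;
}

static void bridge_free(struct bridge *b)
{
	assert(b != NULL);
	free(b);
	live_bridges--;
}

/*
 * Scan step. A negative bus number simulates a scan failure.
 * Assumption: on failure the bridge is left for the caller to free.
 */
static int bridge_scan(struct bridge *b)
{
	if (b->busnum < 0)
		return -1;
	b->scanned = 1;
	return 0;
}

/* Create and scan in one call; returns NULL on any failure. */
static struct bridge *scan_root_bridge(int busnum)
{
	struct bridge *b = bridge_create(busnum);

	if (!b)
		return NULL;

	if (bridge_scan(b) != 0) {
		bridge_free(b);	/* released exactly once */
		return NULL;	/* never hand back a freed pointer */
	}
	return b;
}
```

If the scan step were to free the bridge on failure itself, the wrapper
would have to skip its own bridge_free() call.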

Signed-off-by: Yijing Wang 
---
 drivers/pci/probe.c |   21 +
 include/linux/pci.h |3 +++
 2 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 5f748ed..51d69c3 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2068,6 +2068,27 @@ static struct pci_bus *__pci_scan_root_bus(
return b;
 }
 
+struct pci_host_bridge *pci_scan_root_bridge(struct device *parent,
+   u32 db, struct pci_ops *ops, void *sysdata,
+   struct list_head *resources, struct pci_host_bridge_ops 
*phb_ops)
+{
+   struct pci_host_bridge *host;
+   struct pci_bus *bus;
+
+   host = pci_create_host_bridge(parent, db, resources,
+   sysdata, phb_ops);
+   if (!host)
+   return NULL;
+
+   bus = __pci_scan_root_bus(host, ops);
+   if (!bus)
+   return NULL;
+
+   return host;
+}
+EXPORT_SYMBOL(pci_scan_root_bridge);
+
+
 struct pci_bus *pci_scan_root_bus(struct device *parent, u32 db,
struct pci_ops *ops, void *sysdata, struct list_head *resources)
 {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c06b95d..5592737 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -797,6 +797,9 @@ void pci_bus_release_busn_res(struct pci_bus *b);
 struct pci_bus *pci_scan_root_bus(struct device *parent, u32 dombus,
 struct pci_ops *ops, void *sysdata,
 struct list_head *resources);
+struct pci_host_bridge *pci_scan_root_bridge(struct device *parent,
+   u32 dombus, struct pci_ops *ops, void *sysdata,
+   struct list_head *resources, struct pci_host_bridge_ops 
*phb_ops);
 struct pci_bus *pci_add_new_bus(struct pci_bus *parent, struct pci_dev *dev,
int busnr);
 void pcie_update_link_speed(struct pci_bus *bus, u16 link_status);
-- 
1.7.1



[PATCH 11/28] PCI: Introduce pci_host_bridge_ops to setup host bridge

2015-01-15 Thread Yijing Wang
We currently use weak functions such as pcibios_root_bridge_prepare()
to set up the PCI host bridge. Introduce pci_host_bridge_ops, which
holds the host bridge specific operations needed to set up a
pci_host_bridge. Host bridge drivers can then provide
pci_host_bridge_ops hooks instead of overriding weak functions.
This patch adds the following pci_host_bridge_ops hooks:

pci_host_bridge_ops {
	/* set root bus speed; some platforms, such as powerpc, need this */
	void (*phb_set_root_bus_speed)(struct pci_host_bridge *host);
	/* set up pci_host_bridge before it is added to the driver core */
	int (*phb_prepare)(struct pci_host_bridge *host);
	/* probe whether the pci_host_bridge should be scanned in OF mode */
	void (*phb_probe_mode)(struct pci_host_bridge *);
	/* platform specific OF scan hook to scan PCI devices */
	void (*phb_of_scan_bus)(struct pci_host_bridge *);
}
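
The difference between a weak function and an optional ops hook can be
sketched in plain userspace C. This is only an illustration under
simplified assumptions: struct host_bridge, struct host_bridge_ops,
example_prepare() and host_bridge_run_prepare() are hypothetical names,
not the kernel structures touched by this patch.

```c
#include <assert.h>
#include <stddef.h>

struct host_bridge;

/* Optional per-platform hooks; a NULL pointer means "no hook". */
struct host_bridge_ops {
	int (*prepare)(struct host_bridge *host);
};

/* Hypothetical stand-in for struct pci_host_bridge. */
struct host_bridge {
	const struct host_bridge_ops *ops;	/* may be NULL */
	int prepared;				/* set by the example hook */
};

/* Example platform hook: marks the bridge as prepared. */
static int example_prepare(struct host_bridge *host)
{
	host->prepared = 1;
	return 0;
}

/* Call the prepare hook only if the platform supplied one. */
static int host_bridge_run_prepare(struct host_bridge *host)
{
	assert(host != NULL);

	if (host->ops && host->ops->prepare)
		return host->ops->prepare(host);

	return 0;	/* no hook: nothing to do, not an error */
}
```

With a weak function there is one override per kernel image; with an
ops table each bridge instance can carry its own hooks, or none.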

Signed-off-by: Yijing Wang 
---
 drivers/pci/host-bridge.c |   12 ++--
 drivers/pci/probe.c   |   19 ++-
 include/linux/pci.h   |   16 ++--
 3 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
index 0b6ba5c..ccbf168 100644
--- a/drivers/pci/host-bridge.c
+++ b/drivers/pci/host-bridge.c
@@ -23,8 +23,8 @@ static void pci_release_host_bridge_dev(struct device *dev)
 }
 
 struct pci_host_bridge *pci_create_host_bridge(
-   struct device *parent, u32 db, 
-   struct list_head *resources, void *sysdata)
+   struct device *parent, u32 db, struct list_head *resources,
+   void *sysdata, struct pci_host_bridge_ops *ops)
 {
int error;
int bus = PCI_BUSNUM(db);
@@ -56,6 +56,7 @@ struct pci_host_bridge *pci_create_host_bridge(
}
mutex_unlock(_mutex);
 
+   host->ops = ops;
host->dev.parent = parent;
INIT_LIST_HEAD(&host->windows);
host->dev.release = pci_release_host_bridge_dev;
@@ -63,6 +64,13 @@ struct pci_host_bridge *pci_create_host_bridge(
dev_set_name(&host->dev, "pci%04x:%02x", host->domain, 
host->busnum);
 
+   if (host->ops && host->ops->phb_prepare) {
+   error = host->ops->phb_prepare(host);
+   if (error) {
+   kfree(host);
+   return NULL;
+   }
+   }
error = device_register(&host->dev);
if (error) {
put_device(&host->dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 98a8d97..5f748ed 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1888,6 +1888,8 @@ static struct pci_bus *__pci_create_root_bus(
 
bridge->bus = b;
b->bridge = get_device(&bridge->dev);
+   if (bridge->ops && bridge->ops->phb_set_root_bus_speed)
+   bridge->ops->phb_set_root_bus_speed(bridge);
error = pcibios_root_bridge_prepare(bridge);
if (error)
goto err_out;
@@ -1953,7 +1955,7 @@ struct pci_bus *pci_create_root_bus(struct device 
*parent, u32 db,
 {
struct pci_host_bridge *host;
 
-   host = pci_create_host_bridge(parent, db, resources, sysdata);
+   host = pci_create_host_bridge(parent, db, resources, sysdata, NULL);
if (!host)
return NULL;

@@ -2051,10 +2053,17 @@ static struct pci_bus *__pci_scan_root_bus(
pci_bus_insert_busn_res(b, host->busnum, 255);
}
 
-   max = pci_scan_child_bus(b);
+   if (host->ops && host->ops->phb_probe_mode)
+   host->ops->phb_probe_mode(host);
 
-   if (!found)
-   pci_bus_update_busn_res_end(b, max);
+   if (host->of_scan) {
+   if (host->ops && host->ops->phb_of_scan_bus)
+   host->ops->phb_of_scan_bus(host);
+   } else {
+   max = pci_scan_child_bus(b);
+   if (!found)
+   pci_bus_update_busn_res_end(b, max);
+   }
 
return b;
 }
@@ -2064,7 +2073,7 @@ struct pci_bus *pci_scan_root_bus(struct device *parent, 
u32 db,
 {
struct pci_host_bridge *host;
 
-   host = pci_create_host_bridge(parent, db, resources, sysdata);
+   host = pci_create_host_bridge(parent, db, resources, sysdata, NULL);
if (!host)
return NULL;
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 3ee8436..c06b95d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -401,13 +401,25 @@ struct pci_host_bridge_window {
resource_size_t offset; /* bus address + offset = CPU address */
 };
 
+struct pci_host_bridge;
+struct pci_host_bridge_ops {
+   void (*phb_set_root_bus_speed)(struct pci_host_bridge *host);
+   int (*phb_prepare)(struct pci_host_bridge *host);
+   /* Override domain number by host specific .phv_assign_domain_nr()  */
+   void (*phb_assign_domain_nr)(struct pci_host_bridge *);
+   void (*phb_probe_mode)(struct 

[PATCH 10/28] PCI: Save sysdata in pci_host_bridge drvdata

2015-01-15 Thread Yijing Wang
Save the platform specific sysdata in the pci_host_bridge
drvdata, because host bridge specific operations need to
access it before the PCI bus is created.

Signed-off-by: Yijing Wang 
---
 drivers/pci/host-bridge.c |4 +++-
 drivers/pci/probe.c   |   18 --
 include/linux/pci.h   |3 ++-
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
index c9ee582..0b6ba5c 100644
--- a/drivers/pci/host-bridge.c
+++ b/drivers/pci/host-bridge.c
@@ -23,7 +23,8 @@ static void pci_release_host_bridge_dev(struct device *dev)
 }
 
 struct pci_host_bridge *pci_create_host_bridge(
-   struct device *parent, u32 db, struct list_head *resources)
+   struct device *parent, u32 db,
+   struct list_head *resources, void *sysdata)
 {
int error;
int bus = PCI_BUSNUM(db);
@@ -58,6 +59,7 @@ struct pci_host_bridge *pci_create_host_bridge(
host->dev.parent = parent;
INIT_LIST_HEAD(&host->windows);
host->dev.release = pci_release_host_bridge_dev;
+   dev_set_drvdata(&host->dev, sysdata);
dev_set_name(&host->dev, "pci%04x:%02x", host->domain, 
host->busnum);
 
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2e0b952..98a8d97 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1865,8 +1865,7 @@ void __weak pcibios_remove_bus(struct pci_bus *bus)
 }
 
 static struct pci_bus *__pci_create_root_bus(
-   struct pci_host_bridge *bridge, struct pci_ops *ops,
-   void *sysdata)
+   struct pci_host_bridge *bridge, struct pci_ops *ops)
 {
int error;
struct pci_bus *b;
@@ -1882,7 +1881,7 @@ static struct pci_bus *__pci_create_root_bus(
if (!b)
return NULL;
 
-   b->sysdata = sysdata;
+   b->sysdata = dev_get_drvdata(&bridge->dev);
b->ops = ops;
b->number = b->busn_res.start = bridge->busnum;
pci_bus_assign_domain_nr(b, parent);
@@ -1954,11 +1953,11 @@ struct pci_bus *pci_create_root_bus(struct device 
*parent, u32 db,
 {
struct pci_host_bridge *host;
 
-   host = pci_create_host_bridge(parent, db, resources);
+   host = pci_create_host_bridge(parent, db, resources, sysdata);
if (!host)
return NULL;

-   return __pci_create_root_bus(host, ops, sysdata);
+   return __pci_create_root_bus(host, ops);
 }
 
 int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int bus_max)
@@ -2025,8 +2024,7 @@ void pci_bus_release_busn_res(struct pci_bus *b)
 }
 
 static struct pci_bus *__pci_scan_root_bus(
-   struct pci_host_bridge *host, struct pci_ops *ops,
-   void *sysdata)
+   struct pci_host_bridge *host, struct pci_ops *ops)
 {
 
struct pci_host_bridge_window *window;
@@ -2040,7 +2038,7 @@ static struct pci_bus *__pci_scan_root_bus(
break;
}
 
-   b = __pci_create_root_bus(host, ops, sysdata);
+   b = __pci_create_root_bus(host, ops);
if (!b) {
pci_free_host_bridge(host);
return NULL;
@@ -2066,11 +2064,11 @@ struct pci_bus *pci_scan_root_bus(struct device 
*parent, u32 db,
 {
struct pci_host_bridge *host;
 
-   host = pci_create_host_bridge(parent, db, resources);
+   host = pci_create_host_bridge(parent, db, resources, sysdata);
if (!host)
return NULL;
 
-   return __pci_scan_root_bus(host, ops, sysdata);
+   return __pci_scan_root_bus(host, ops);
 }
 EXPORT_SYMBOL(pci_scan_root_bus);
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5ee0033..3ee8436 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -419,7 +419,8 @@ void pci_set_host_bridge_release(struct pci_host_bridge 
*bridge,
 
 int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge);
 struct pci_host_bridge *pci_create_host_bridge(
-   struct device *parent, u32 dombus, struct list_head *resources);
+   struct device *parent, u32 dombus,
+   struct list_head *resources, void *sysdata);
 /*
  * The first PCI_BRIDGE_RESOURCE_NUM PCI bus resources (those that correspond
  * to P2P or CardBus bridge windows) go in a table.  Additional ones (for
-- 
1.7.1



[PATCH 17/28] PCI: Remove weak pcibios_root_bridge_prepare()

2015-01-15 Thread Yijing Wang
Nothing uses the weak pcibios_root_bridge_prepare() any more,
so remove it.

Signed-off-by: Yijing Wang 
---
 drivers/pci/probe.c |   15 ---
 include/linux/pci.h |2 --
 2 files changed, 0 insertions(+), 17 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 51d69c3..30323ac 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1844,18 +1844,6 @@ unsigned int pci_scan_child_bus(struct pci_bus *bus)
 }
 EXPORT_SYMBOL_GPL(pci_scan_child_bus);
 
-/**
- * pcibios_root_bridge_prepare - Platform-specific host bridge setup.
- * @bridge: Host bridge to set up.
- *
- * Default empty implementation.  Replace with an architecture-specific setup
- * routine, if necessary.
- */
-int __weak pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
-{
-   return 0;
-}
-
 void __weak pcibios_add_bus(struct pci_bus *bus)
 {
 }
@@ -1890,9 +1878,6 @@ static struct pci_bus *__pci_create_root_bus(
b->bridge = get_device(&bridge->dev);
if (bridge->ops && bridge->ops->phb_set_root_bus_speed)
bridge->ops->phb_set_root_bus_speed(bridge);
-   error = pcibios_root_bridge_prepare(bridge);
-   if (error)
-   goto err_out;
 
device_enable_async_suspend(b->bridge);
pci_set_bus_of_node(b);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5592737..da3a071 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -428,8 +428,6 @@ struct pci_host_bridge {
 void pci_set_host_bridge_release(struct pci_host_bridge *bridge,
 void (*release_fn)(struct pci_host_bridge *),
 void *release_data);
-
-int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge);
 struct pci_host_bridge *pci_create_host_bridge(
struct device *parent, u32 dombus, struct list_head *resources, 
void *sysdata, struct pci_host_bridge_ops *ops);
-- 
1.7.1



[PATCH 19/28] PCI: Introduce pci_bus_child_max_busnr()

2015-01-15 Thread Yijing Wang
Sometimes we need to know the highest reserved bus
number among a bus's children, because the parent's
bus->busn_res may have padding in it.
This function returns the max child busnr, like
pci_scan_child_bus() does.
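
To make the padding argument concrete, here is a small userspace model
of the two lookups. It is a sketch only: struct bus with a fixed-size
children array, bus_max_busnr() and bus_child_max_busnr() are
simplified, hypothetical stand-ins for struct pci_bus,
pci_bus_max_busnr() and pci_bus_child_max_busnr().

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical stand-in for struct pci_bus: a node with a bus range. */
struct bus {
	unsigned char start;	/* first bus number of the range */
	unsigned char end;	/* last reserved number, may include padding */
	struct bus *children[MAX_CHILDREN];
	size_t nr_children;
};

/* Highest bus number reserved by this bus or anything below it. */
static unsigned char bus_max_busnr(const struct bus *bus)
{
	unsigned char max = bus->end;
	size_t i;

	assert(bus->nr_children <= MAX_CHILDREN);
	for (i = 0; i < bus->nr_children; i++) {
		unsigned char n = bus_max_busnr(bus->children[i]);

		if (n > max)
			max = n;
	}
	return max;
}

/*
 * Highest bus number reserved by the children only. The parent's own
 * end is ignored because it may be padded; the search starts from the
 * parent's start number instead.
 */
static unsigned char bus_child_max_busnr(const struct bus *bus)
{
	unsigned char max = bus->start;
	size_t i;

	assert(bus->nr_children <= MAX_CHILDREN);
	for (i = 0; i < bus->nr_children; i++) {
		unsigned char n = bus_max_busnr(bus->children[i]);

		if (n > max)
			max = n;
	}
	return max;
}
```

For a root bus whose range is padded out to 255 but whose children
only reach bus 6, the first function reports 255 and the second 6.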

Signed-off-by: Yijing Wang 
---
 drivers/pci/hotplug/acpiphp_glue.c |   29 +
 drivers/pci/pci.c  |   25 -
 include/linux/pci.h|2 +-
 3 files changed, 26 insertions(+), 30 deletions(-)

diff --git a/drivers/pci/hotplug/acpiphp_glue.c 
b/drivers/pci/hotplug/acpiphp_glue.c
index bcb90e4..84f2584 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -397,33 +397,6 @@ static void cleanup_bridge(struct acpiphp_bridge *bridge)
acpi_unlock_hp_context();
 }
 
-/**
- * acpiphp_max_busnr - return the highest reserved bus number under the given 
bus.
- * @bus: bus to start search with
- */
-static unsigned char acpiphp_max_busnr(struct pci_bus *bus)
-{
-   struct pci_bus *tmp;
-   unsigned char max, n;
-
-   /*
-* pci_bus_max_busnr will return the highest
-* reserved busnr for all these children.
-* that is equivalent to the bus->subordinate
-* value.  We don't want to use the parent's
-* bus->subordinate value because it could have
-* padding in it.
-*/
-   max = bus->busn_res.start;
-
-   list_for_each_entry(tmp, &bus->children, node) {
-   n = pci_bus_max_busnr(tmp);
-   if (n > max)
-   max = n;
-   }
-   return max;
-}
-
 static void acpiphp_set_acpi_region(struct acpiphp_slot *slot)
 {
struct acpiphp_func *func;
@@ -489,7 +462,7 @@ static void enable_slot(struct acpiphp_slot *slot)
LIST_HEAD(add_list);
 
acpiphp_rescan_slot(slot);
-   max = acpiphp_max_busnr(bus);
+   max = pci_bus_child_max_busnr(bus);
for (pass = 0; pass < 2; pass++) {
list_for_each_entry(dev, &bus->devices, bus_list) {
if (PCI_SLOT(dev->devfn) != slot->device)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8b35e8e..a05f406 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -121,7 +121,30 @@ unsigned char pci_bus_max_busnr(struct pci_bus *bus)
}
return max;
 }
-EXPORT_SYMBOL_GPL(pci_bus_max_busnr);
+
+unsigned char pci_bus_child_max_busnr(struct pci_bus *bus)
+{
+   struct pci_bus *tmp;
+   unsigned char max, n;
+
+   /*
+* pci_bus_max_busnr will return the highest
+* reserved busnr for all these children.
+* that is equivalent to the bus->subordinate
+* value.  We don't want to use the parent's
+* bus->subordinate value because it could have
+* padding in it.
+*/
+   max = bus->busn_res.start;
+
+   list_for_each_entry(tmp, &bus->children, node) {
+   n = pci_bus_max_busnr(tmp);
+   if (n > max)
+   max = n;
+   }
+   return max;
+}
+EXPORT_SYMBOL_GPL(pci_bus_child_max_busnr);
 
 #ifdef CONFIG_HAS_IOMEM
 void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index da3a071..2970a84 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1176,7 +1176,7 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev 
*dev, int max,
 void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
  void *userdata);
 int pci_cfg_space_size(struct pci_dev *dev);
-unsigned char pci_bus_max_busnr(struct pci_bus *bus);
+unsigned char pci_bus_child_max_busnr(struct pci_bus *bus);
 void pci_setup_bridge(struct pci_bus *bus);
 resource_size_t pcibios_window_alignment(struct pci_bus *bus,
 unsigned long type);
-- 
1.7.1



[PATCH 15/28] PCI/powerpc: Rename pcibios_root_bridge_prepare() for better readability

2015-01-15 Thread Yijing Wang
pcibios_root_bridge_prepare() on powerpc is only used to set
the root bus speed, so rename the ppc_md hook to
pcibios_set_root_bus_speed() and the pseries implementation to
pseries_set_root_bus_speed() for better readability.

Signed-off-by: Yijing Wang 
---
 arch/powerpc/include/asm/machdep.h   |2 +-
 arch/powerpc/kernel/pci-common.c |4 ++--
 arch/powerpc/platforms/pseries/pci.c |2 +-
 arch/powerpc/platforms/pseries/pseries.h |2 +-
 arch/powerpc/platforms/pseries/setup.c   |2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h 
b/arch/powerpc/include/asm/machdep.h
index c8175a3..8e7f2a8 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -129,7 +129,7 @@ struct machdep_calls {
void(*pcibios_fixup)(void);
int (*pci_probe_mode)(struct pci_bus *);
void(*pci_irq_fixup)(struct pci_dev *dev);
-   int (*pcibios_root_bridge_prepare)(struct pci_host_bridge
+   int (*pcibios_set_root_bus_speed)(struct pci_host_bridge
*bridge);
 
/* To setup PHBs when using automatic OF platform driver for PCI */
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 927c3dd..2cf941e 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -769,8 +769,8 @@ int pci_proc_domain(struct pci_bus *bus)
 
 int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
 {
-   if (ppc_md.pcibios_root_bridge_prepare)
-   return ppc_md.pcibios_root_bridge_prepare(bridge);
+   if (ppc_md.pcibios_set_root_bus_speed)
+   return ppc_md.pcibios_set_root_bus_speed(bridge);
 
return 0;
 }
diff --git a/arch/powerpc/platforms/pseries/pci.c 
b/arch/powerpc/platforms/pseries/pci.c
index fe16a50..af685d6 100644
--- a/arch/powerpc/platforms/pseries/pci.c
+++ b/arch/powerpc/platforms/pseries/pci.c
@@ -110,7 +110,7 @@ static void fixup_winbond_82c105(struct pci_dev* dev)
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_WINBOND, PCI_DEVICE_ID_WINBOND_82C105,
 fixup_winbond_82c105);
 
-int pseries_root_bridge_prepare(struct pci_host_bridge *bridge)
+int pseries_set_root_bus_speed(struct pci_host_bridge *bridge)
 {
struct device_node *dn, *pdn;
struct pci_bus *bus;
diff --git a/arch/powerpc/platforms/pseries/pseries.h 
b/arch/powerpc/platforms/pseries/pseries.h
index 1796c54..5d0be3a 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -63,7 +63,7 @@ extern int dlpar_detach_node(struct device_node *);
 
 /* PCI root bridge prepare function override for pseries */
 struct pci_host_bridge;
-int pseries_root_bridge_prepare(struct pci_host_bridge *bridge);
+int pseries_set_root_bus_speed(struct pci_host_bridge *bridge);
 
 unsigned long pseries_memory_block_size(void);
 
diff --git a/arch/powerpc/platforms/pseries/setup.c 
b/arch/powerpc/platforms/pseries/setup.c
index e445b67..b196c0d 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -496,7 +496,7 @@ static void __init pSeries_setup_arch(void)
ppc_md.enable_pmcs = power4_enable_pmcs;
}
 
-   ppc_md.pcibios_root_bridge_prepare = pseries_root_bridge_prepare;
+   ppc_md.pcibios_set_root_bus_speed = pseries_set_root_bus_speed;
 
if (firmware_has_feature(FW_FEATURE_SET_MODE)) {
long rc;
-- 
1.7.1



[PATCH 23/28] PCI/designware: Use pci_scan_root_bus() for simplicity

2015-01-15 Thread Yijing Wang
Use pci_scan_root_bus() instead of pci_create_root_bus() +
pci_scan_child_bus() for simplicity.

Signed-off-by: Yijing Wang 
---
 drivers/pci/host/pcie-designware.c |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/host/pcie-designware.c 
b/drivers/pci/host/pcie-designware.c
index eef3111..d37fe27 100644
--- a/drivers/pci/host/pcie-designware.c
+++ b/drivers/pci/host/pcie-designware.c
@@ -725,13 +725,11 @@ static struct pci_bus *dw_pcie_scan_bus(int nr, struct 
pci_sys_data *sys)
struct pcie_port *pp = sys_to_pcie(sys);
 
pp->root_bus_nr = sys->busnr;
-   bus = pci_create_root_bus(pp->dev, sys->busnr,
+   bus = pci_scan_root_bus(pp->dev, sys->busnr,
  _pcie_ops, sys, >resources);
if (!bus)
return NULL;
 
-   pci_scan_child_bus(bus);
-
if (bus && pp->ops->scan_bus)
pp->ops->scan_bus(pp);
 
-- 
1.7.1



[PATCH 25/28] PCI: Rename __pci_create_root_bus() to pci_create_root_bus()

2015-01-15 Thread Yijing Wang
Nothing uses pci_create_root_bus() any more, so remove it
and rename __pci_create_root_bus() to pci_create_root_bus().

Signed-off-by: wangyij...@huawei.com
---
 drivers/pci/probe.c |   27 ---
 include/linux/pci.h |3 ---
 2 files changed, 8 insertions(+), 22 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 30323ac..0817910 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1852,7 +1852,7 @@ void __weak pcibios_remove_bus(struct pci_bus *bus)
 {
 }
 
-static struct pci_bus *__pci_create_root_bus(
+static struct pci_bus *pci_create_root_bus(
struct pci_host_bridge *bridge, struct pci_ops *ops)
 {
int error;
@@ -1935,18 +1935,6 @@ err_out:
return NULL;
 }
 
-struct pci_bus *pci_create_root_bus(struct device *parent, u32 db,
-   struct pci_ops *ops, void *sysdata, struct list_head *resources)
-{
-   struct pci_host_bridge *host;
-
-   host = pci_create_host_bridge(parent, db, resources, sysdata, NULL);
-   if (!host)
-   return NULL;
-   
-   return __pci_create_root_bus(host, ops);
-}
-
 int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int bus_max)
 {
struct resource *res = &b->busn_res;
@@ -2025,7 +2013,7 @@ static struct pci_bus *__pci_scan_root_bus(
break;
}
 
-   b = __pci_create_root_bus(host, ops);
+   b = pci_create_root_bus(host, ops);
if (!b) {
pci_free_host_bridge(host);
return NULL;
@@ -2091,18 +2079,19 @@ struct pci_bus *pci_scan_bus_legacy(u32 db, struct 
pci_ops *ops,
void *sysdata)
 {
LIST_HEAD(resources);
-   struct pci_bus *b;
+   struct pci_host_bridge *host;
 
pci_add_resource(, _resource);
pci_add_resource(, _resource);
pci_add_resource(, _resource);
-   b = pci_create_root_bus(NULL, db, ops, sysdata, );
-   if (b) {
-   pci_scan_child_bus(b);
+   host = pci_create_host_bridge(NULL, db, &resources, sysdata, NULL);
+   if (host) {
+   __pci_scan_root_bus(host, ops);
+   return host->bus;
} else {
pci_free_resource_list(&resources);
}
-   return b;
+   return NULL;
 }
 EXPORT_SYMBOL(pci_scan_bus_legacy);
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2970a84..9ddca3b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -785,9 +785,6 @@ void pcibios_scan_specific_bus(int busn);
 struct pci_bus *pci_find_bus(int domain, int busnr);
 void pci_bus_add_devices(const struct pci_bus *bus);
 struct pci_bus *pci_scan_bus_legacy(u32 dombus, struct pci_ops *ops, void 
*sysdata);
-struct pci_bus *pci_create_root_bus(struct device *parent, u32 dombus,
-   struct pci_ops *ops, void *sysdata,
-   struct list_head *resources);
 void pci_free_host_bridge(struct pci_host_bridge *host);
 int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int busmax);
 int pci_bus_update_busn_res_end(struct pci_bus *b, int busmax);
-- 
1.7.1



[PATCH 28/28] PCI: Remove pci_bus_assign_domain_nr()

2015-01-15 Thread Yijing Wang
Now that the domain number is saved in pci_host_bridge,
remove pci_bus_assign_domain_nr() and drop the domain_nr
member from pci_bus.
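
The idea of keeping the domain number only in the host bridge can be
sketched as follows. This is a userspace model with hypothetical names
(struct host_bridge, struct bus, find_host_bridge(), bus_domain_nr()),
and it assumes every bus can reach its root bus through a parent
pointer and that only the root bus points at the bridge.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_host_bridge. */
struct host_bridge {
	int domain;
};

/* Hypothetical stand-in for struct pci_bus, without a domain member. */
struct bus {
	struct bus *parent;		/* NULL for the root bus */
	struct host_bridge *bridge;	/* set on the root bus only */
};

/* Walk up to the root bus and return the bridge it belongs to. */
static struct host_bridge *find_host_bridge(struct bus *bus)
{
	assert(bus != NULL);

	while (bus->parent)
		bus = bus->parent;

	return bus->bridge;
}

/* The domain of any bus is the domain of its host bridge. */
static int bus_domain_nr(struct bus *bus)
{
	return find_host_bridge(bus)->domain;
}
```

The trade-off is a short walk up the bus hierarchy on each lookup in
exchange for having a single place where the domain number is stored.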

Signed-off-by: Yijing Wang 
---
 drivers/pci/pci.c   |   48 +---
 drivers/pci/probe.c |   12 
 include/linux/pci.h |3 ---
 3 files changed, 5 insertions(+), 58 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 72232d4..3eb3d8e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4472,6 +4472,7 @@ int pci_domain_nr(struct pci_bus *bus)
return host->domain;
 }
 EXPORT_SYMBOL_GPL(pci_domain_nr);
+#endif
 
 #ifdef CONFIG_PCI_DOMAINS_GENERIC
 void pci_host_assign_domain_nr(struct pci_host_bridge *host)
@@ -4519,53 +4520,6 @@ void pci_host_assign_domain_nr(struct pci_host_bridge 
*host)
 
host->domain_nr = domain;
 }
-
-
-void pci_bus_assign_domain_nr(struct pci_bus *bus, struct device *parent)
-{
-   static int use_dt_domains = -1;
-   int domain = of_get_pci_domain_nr(parent->of_node);
-
-   /*
-* Check DT domain and use_dt_domains values.
-*
-* If DT domain property is valid (domain >= 0) and
-* use_dt_domains != 0, the DT assignment is valid since this means
-* we have not previously allocated a domain number by using
-* pci_get_new_domain_nr(); we should also update use_dt_domains to
-* 1, to indicate that we have just assigned a domain number from
-* DT.
-*
-* If DT domain property value is not valid (ie domain < 0), and we
-* have not previously assigned a domain number from DT
-* (use_dt_domains != 1) we should assign a domain number by
-* using the:
-*
-* pci_get_new_domain_nr()
-*
-* API and update the use_dt_domains value to keep track of method we
-* are using to assign domain numbers (use_dt_domains = 0).
-*
-* All other combinations imply we have a platform that is trying
-* to mix domain numbers obtained from DT and pci_get_new_domain_nr(),
-* which is a recipe for domain mishandling and it is prevented by
-* invalidating the domain value (domain = -1) and printing a
-* corresponding error.
-*/
-   if (domain >= 0 && use_dt_domains) {
-   use_dt_domains = 1;
-   } else if (domain < 0 && use_dt_domains != 1) {
-   use_dt_domains = 0;
-   domain = pci_get_new_domain_nr();
-   } else {
-   dev_err(parent, "Node %s has inconsistent \"linux,pci-domain\" 
property in DT\n",
-   parent->of_node->full_name);
-   domain = -1;
-   }
-
-   bus->domain_nr = domain;
-}
-#endif
 #endif
 
 /**
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 0817910..d8b76ef 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -480,7 +480,7 @@ void pci_read_bridge_bases(struct pci_bus *child)
}
 }
 
-static struct pci_bus *pci_alloc_bus(struct pci_bus *parent)
+static struct pci_bus *pci_alloc_bus(void)
 {
struct pci_bus *b;
 
@@ -495,10 +495,7 @@ static struct pci_bus *pci_alloc_bus(struct pci_bus 
*parent)
INIT_LIST_HEAD(&b->resources);
b->max_bus_speed = PCI_SPEED_UNKNOWN;
b->cur_bus_speed = PCI_SPEED_UNKNOWN;
-#ifdef CONFIG_PCI_DOMAINS_GENERIC
-   if (parent)
-   b->domain_nr = parent->domain_nr;
-#endif
+
return b;
 }
 
@@ -645,7 +642,7 @@ static struct pci_bus *pci_alloc_child_bus(struct pci_bus 
*parent,
/*
 * Allocate a new bus, and inherit stuff from the parent..
 */
-   child = pci_alloc_bus(parent);
+   child = pci_alloc_bus();
if (!child)
return NULL;
 
@@ -1865,14 +1862,13 @@ static struct pci_bus *pci_create_root_bus(
char *fmt;
 
parent = bridge->dev.parent;
-   b = pci_alloc_bus(NULL);
+   b = pci_alloc_bus();
if (!b)
return NULL;
 
b->sysdata = dev_get_drvdata(&bridge->dev);
b->ops = ops;
b->number = b->busn_res.start = bridge->busnum;
-   pci_bus_assign_domain_nr(b, parent);
 
bridge->bus = b;
b->bridge = get_device(&bridge->dev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e7ca546..ac25d2a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -475,9 +475,6 @@ struct pci_bus {
unsigned char   primary;/* number of primary bridge */
unsigned char   max_bus_speed;  /* enum pci_bus_speed */
unsigned char   cur_bus_speed;  /* enum pci_bus_speed */
-#ifdef CONFIG_PCI_DOMAINS_GENERIC
-   int domain_nr;
-#endif
 
charname[48];
 
-- 
1.7.1


[PATCH 03/28] xen/PCI: Don't use deprecated function pci_scan_bus_parented()

2015-01-15 Thread Yijing Wang
From: Arnd Bergmann 

Use pci_scan_root_bus() instead of the deprecated function
pci_scan_bus_parented().

Signed-off-by: Arnd Bergmann 
Signed-off-by: Yijing Wang 
CC: Konrad Rzeszutek Wilk 
CC: xen-de...@lists.xenproject.org
---
 drivers/pci/xen-pcifront.c |   10 +++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
index b1ffebe..240ddbc 100644
--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -446,6 +446,7 @@ static int pcifront_scan_root(struct pcifront_device *pdev,
 unsigned int domain, unsigned int bus)
 {
struct pci_bus *b;
+   LIST_HEAD(resources);
struct pcifront_sd *sd = NULL;
struct pci_bus_entry *bus_entry = NULL;
int err = 0;
@@ -470,17 +471,20 @@ static int pcifront_scan_root(struct pcifront_device 
*pdev,
err = -ENOMEM;
goto err_out;
}
+   pci_add_resource(, _resource);
+   pci_add_resource(, _resource);
pcifront_init_sd(sd, domain, bus, pdev);
 
pci_lock_rescan_remove();
 
-   b = pci_scan_bus_parented(>xdev->dev, bus,
- _bus_ops, sd);
+   b = pci_scan_root_bus(>xdev->dev, bus,
+ _bus_ops, sd, );
if (!b) {
dev_err(&pdev->xdev->dev,
"Error creating PCI Frontend Bus!\n");
err = -ENOMEM;
pci_unlock_rescan_remove();
+   pci_free_resource_list();
goto err_out;
}
 
@@ -488,7 +492,7 @@ static int pcifront_scan_root(struct pcifront_device *pdev,
 
list_add(_entry->list, >root_buses);
 
-   /* pci_scan_bus_parented skips devices which do not have a have
+   /* pci_scan_root_bus skips devices which do not have a have
* devfn==0. The pcifront_scan_bus enumerates all devfn. */
err = pcifront_scan_bus(pdev, domain, bus, b);
 
-- 
1.7.1



[PATCH 24/28] PCI/xgene: Use pci_scan_root_bus() instead of pci_create_root_bus()

2015-01-15 Thread Yijing Wang
Use pci_scan_root_bus() instead of pci_create_root_bus() +
pci_scan_child_bus() for simplicity.

Signed-off-by: Yijing Wang 
---
 drivers/pci/host/pci-xgene.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/host/pci-xgene.c b/drivers/pci/host/pci-xgene.c
index b1d0596..7e46a39 100644
--- a/drivers/pci/host/pci-xgene.c
+++ b/drivers/pci/host/pci-xgene.c
@@ -631,12 +631,11 @@ static int xgene_pcie_probe_bridge(struct platform_device *pdev)
if (ret)
return ret;
 
-   bus = pci_create_root_bus(>dev, 0,
+   bus = pci_scan_root_bus(>dev, 0,
_pcie_ops, port, );
if (!bus)
return -ENOMEM;
 
-   pci_scan_child_bus(bus);
pci_assign_unassigned_bus_resources(bus);
pci_bus_add_devices(bus);
 
-- 
1.7.1



[PATCH 26/28] PCI: Export find_pci_host_bridge()

2015-01-15 Thread Yijing Wang
Export find_pci_host_bridge().

Signed-off-by: Yijing Wang 
---
 drivers/pci/host-bridge.c |2 +-
 include/linux/pci.h   |1 +
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
index ccbf168..74f7572 100644
--- a/drivers/pci/host-bridge.c
+++ b/drivers/pci/host-bridge.c
@@ -104,7 +104,7 @@ static struct pci_bus *find_pci_root_bus(struct pci_bus *bus)
return bus;
 }
 
-static struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
+struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
 {
struct pci_bus *root_bus = find_pci_root_bus(bus);
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 9ddca3b..3ebee9d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -431,6 +431,7 @@ void pci_set_host_bridge_release(struct pci_host_bridge *bridge,
 struct pci_host_bridge *pci_create_host_bridge(
struct device *parent, u32 dombus, struct list_head *resources, 
void *sysdata, struct pci_host_bridge_ops *ops);
+struct pci_host_bridge *find_pci_host_bridge(struct pci_bus* bus);
 /*
  * The first PCI_BRIDGE_RESOURCE_NUM PCI bus resources (those that correspond
  * to P2P or CardBus bridge windows) go in a table.  Additional ones (for
-- 
1.7.1



[PATCH 05/28] PCI: Rename pci_scan_bus() to pci_scan_bus_legacy()

2015-01-15 Thread Yijing Wang
pci_scan_bus() is called by legacy PCI host drivers,
i.e. drivers that pass NULL as the parent device and
use all of IO/MEM as the default resources. Rename
pci_scan_bus() to pci_scan_bus_legacy() for better
readability.

Signed-off-by: Yijing Wang 
---
 arch/alpha/kernel/sys_nautilus.c  |2 +-
 arch/m68k/coldfire/pci.c  |2 +-
 arch/sparc/kernel/pcic.c  |2 +-
 arch/unicore32/kernel/pci.c   |2 +-
 drivers/pci/hotplug/ibmphp_core.c |2 +-
 drivers/pci/probe.c   |4 ++--
 include/linux/pci.h   |2 +-
 7 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/alpha/kernel/sys_nautilus.c b/arch/alpha/kernel/sys_nautilus.c
index 4ae4a40..2c864bb 100644
--- a/arch/alpha/kernel/sys_nautilus.c
+++ b/arch/alpha/kernel/sys_nautilus.c
@@ -206,7 +206,7 @@ nautilus_init_pci(void)
unsigned long memtop = max_low_pfn << PAGE_SHIFT;
 
/* Scan our single hose.  */
-   bus = pci_scan_bus(0, alpha_mv.pci_ops, hose);
+   bus = pci_scan_bus_legacy(0, alpha_mv.pci_ops, hose);
hose->bus = bus;
pcibios_claim_one_bus(bus);
 
diff --git a/arch/m68k/coldfire/pci.c b/arch/m68k/coldfire/pci.c
index d45f087..0ef4dd4 100644
--- a/arch/m68k/coldfire/pci.c
+++ b/arch/m68k/coldfire/pci.c
@@ -312,7 +312,7 @@ static int __init mcf_pci_init(void)
set_current_state(TASK_UNINTERRUPTIBLE);
schedule_timeout(msecs_to_jiffies(200));
 
-   rootbus = pci_scan_bus(0, _pci_ops, NULL);
+   rootbus = pci_scan_bus_legacy(0, _pci_ops, NULL);
rootbus->resource[0] = _pci_io;
rootbus->resource[1] = _pci_mem;
 
diff --git a/arch/sparc/kernel/pcic.c b/arch/sparc/kernel/pcic.c
index 7a82fe2..f7edc97 100644
--- a/arch/sparc/kernel/pcic.c
+++ b/arch/sparc/kernel/pcic.c
@@ -390,7 +390,7 @@ static void __init pcic_pbm_scan_bus(struct linux_pcic *pcic)
 {
struct linux_pbm_info *pbm = >pbm;
 
-   pbm->pci_bus = pci_scan_bus(pbm->pci_first_busno, _ops, pbm);
+   pbm->pci_bus = pci_scan_bus_legacy(pbm->pci_first_busno, _ops, pbm);
if (pbm->pci_bus)
pci_bus_add_devices(pbm->pci_bus);
 #if 0 /* deadwood transplanted from sparc64 */
diff --git a/arch/unicore32/kernel/pci.c b/arch/unicore32/kernel/pci.c
index 3d82024..2e238b4 100644
--- a/arch/unicore32/kernel/pci.c
+++ b/arch/unicore32/kernel/pci.c
@@ -258,7 +258,7 @@ static int __init pci_common_init(void)
 
pci_puv3_preinit();
 
-   puv3_bus = pci_scan_bus(0, _puv3_ops, NULL);
+   puv3_bus = pci_scan_bus_legacy(0, _puv3_ops, NULL);
 
if (!puv3_bus)
panic("PCI: unable to scan bus!");
diff --git a/drivers/pci/hotplug/ibmphp_core.c b/drivers/pci/hotplug/ibmphp_core.c
index 86e3bfd..4ade1b4 100644
--- a/drivers/pci/hotplug/ibmphp_core.c
+++ b/drivers/pci/hotplug/ibmphp_core.c
@@ -765,7 +765,7 @@ static u8 bus_structure_fixup(u8 busno)
(l != 0x) && (l != 0x)) {
debug("%s - Inside bus_structure_fixup()\n",
__func__);
-   b = pci_scan_bus(busno, ibmphp_pci_bus->ops, NULL);
+   b = pci_scan_bus_legacy(busno, ibmphp_pci_bus->ops, NULL);
if (b)
pci_bus_add_devices(b);
break;
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index e44de73..ed894cb 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2091,7 +2091,7 @@ struct pci_bus *pci_scan_root_bus(struct device *parent, int bus,
 }
 EXPORT_SYMBOL(pci_scan_root_bus);
 
-struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops,
+struct pci_bus *pci_scan_bus_legacy(int bus, struct pci_ops *ops,
void *sysdata)
 {
LIST_HEAD(resources);
@@ -2108,7 +2108,7 @@ struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops,
}
return b;
 }
-EXPORT_SYMBOL(pci_scan_bus);
+EXPORT_SYMBOL(pci_scan_bus_legacy);
 
 /**
  * pci_rescan_bus_bridge_resize - scan a PCI bus for devices.
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 55b2c81..a6fa2f1 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -769,7 +769,7 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res,
 void pcibios_scan_specific_bus(int busn);
 struct pci_bus *pci_find_bus(int domain, int busnr);
 void pci_bus_add_devices(const struct pci_bus *bus);
-struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops, void *sysdata);
+struct pci_bus *pci_scan_bus_legacy(int bus, struct pci_ops *ops, void *sysdata);
 struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
struct pci_ops *ops, void *sysdata,
struct list_head *resources);
-- 
1.7.1


[PATCH 27/28] PCI: Remove platform specific pci_domain_nr()

2015-01-15 Thread Yijing Wang
Now that pci_host_bridge holds the domain number,
we can eliminate all platform-specific
pci_domain_nr() implementations.

Signed-off-by: Yijing Wang 
---
 arch/alpha/include/asm/pci.h |2 --
 arch/ia64/include/asm/pci.h  |1 -
 arch/microblaze/pci/pci-common.c |   11 ---
 arch/mips/include/asm/pci.h  |2 --
 arch/powerpc/kernel/pci-common.c |   11 ---
 arch/s390/pci/pci.c  |6 --
 arch/sh/include/asm/pci.h|2 --
 arch/sparc/kernel/pci.c  |   17 -
 arch/tile/include/asm/pci.h  |2 --
 arch/x86/include/asm/pci.h   |6 --
 drivers/pci/pci.c|8 
 include/linux/pci.h  |7 ++-
 12 files changed, 10 insertions(+), 65 deletions(-)

diff --git a/arch/alpha/include/asm/pci.h b/arch/alpha/include/asm/pci.h
index f7f680f..63a9a1e 100644
--- a/arch/alpha/include/asm/pci.h
+++ b/arch/alpha/include/asm/pci.h
@@ -95,8 +95,6 @@ static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
return channel ? 15 : 14;
 }
 
-#define pci_domain_nr(bus) ((struct pci_controller *)(bus)->sysdata)->index
-
 static inline int pci_proc_domain(struct pci_bus *bus)
 {
struct pci_controller *hose = bus->sysdata;
diff --git a/arch/ia64/include/asm/pci.h b/arch/ia64/include/asm/pci.h
index 52af5ed..1dcea49 100644
--- a/arch/ia64/include/asm/pci.h
+++ b/arch/ia64/include/asm/pci.h
@@ -99,7 +99,6 @@ struct pci_controller {
 
 
 #define PCI_CONTROLLER(busdev) ((struct pci_controller *) busdev->sysdata)
-#define pci_domain_nr(busdev)(PCI_CONTROLLER(busdev)->segment)
 
 extern struct pci_ops pci_root_ops;
 
diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 890bd36..81ac523 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -123,17 +123,6 @@ unsigned long pci_address_to_pio(phys_addr_t address)
 }
 EXPORT_SYMBOL_GPL(pci_address_to_pio);
 
-/*
- * Return the domain number for this bus.
- */
-int pci_domain_nr(struct pci_bus *bus)
-{
-   struct pci_controller *hose = pci_bus_to_host(bus);
-
-   return hose->global_number;
-}
-EXPORT_SYMBOL(pci_domain_nr);
-
 /* This routine is meant to be used early during boot, when the
  * PCI bus numbers have not yet been assigned, and you need to
  * issue PCI config cycles to an OF device.
diff --git a/arch/mips/include/asm/pci.h b/arch/mips/include/asm/pci.h
index 6952962..9546396 100644
--- a/arch/mips/include/asm/pci.h
+++ b/arch/mips/include/asm/pci.h
@@ -121,8 +121,6 @@ static inline void pci_dma_burst_advice(struct pci_dev *pdev,
 }
 #endif
 
-#define pci_domain_nr(bus) ((struct pci_controller *)(bus)->sysdata)->index
-
 static inline int pci_proc_domain(struct pci_bus *bus)
 {
struct pci_controller *hose = bus->sysdata;
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 0c7fb81..a482bd6 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -181,17 +181,6 @@ unsigned long pci_address_to_pio(phys_addr_t address)
 }
 EXPORT_SYMBOL_GPL(pci_address_to_pio);
 
-/*
- * Return the domain number for this bus.
- */
-int pci_domain_nr(struct pci_bus *bus)
-{
-   struct pci_controller *hose = pci_bus_to_host(bus);
-
-   return hose->global_number;
-}
-EXPORT_SYMBOL(pci_domain_nr);
-
 /* This routine is meant to be used early during boot, when the
  * PCI bus numbers have not yet been assigned, and you need to
  * issue PCI config cycles to an OF device.
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 612decf..8ca02f7 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -101,12 +101,6 @@ static struct zpci_dev *get_zdev_by_bus(struct pci_bus *bus)
return (bus && bus->sysdata) ? (struct zpci_dev *) bus->sysdata : NULL;
 }
 
-int pci_domain_nr(struct pci_bus *bus)
-{
-   return ((struct zpci_dev *) bus->sysdata)->domain;
-}
-EXPORT_SYMBOL_GPL(pci_domain_nr);
-
 int pci_proc_domain(struct pci_bus *bus)
 {
return pci_domain_nr(bus);
diff --git a/arch/sh/include/asm/pci.h b/arch/sh/include/asm/pci.h
index 5b45115..4dc3ad6 100644
--- a/arch/sh/include/asm/pci.h
+++ b/arch/sh/include/asm/pci.h
@@ -109,8 +109,6 @@ static inline void pci_dma_burst_advice(struct pci_dev *pdev,
 /* Board-specific fixup routines. */
 int pcibios_map_platform_irq(const struct pci_dev *dev, u8 slot, u8 pin);
 
-#define pci_domain_nr(bus) ((struct pci_channel *)(bus)->sysdata)->index
-
 static inline int pci_proc_domain(struct pci_bus *bus)
 {
struct pci_channel *hose = bus->sysdata;
diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c
index 42dc21f..519f121 100644
--- a/arch/sparc/kernel/pci.c
+++ b/arch/sparc/kernel/pci.c
@@ -906,23 +906,6 @@ int pcibus_to_node(struct pci_bus *pbus)
 EXPORT_SYMBOL(pcibus_to_node);
 #endif
 
-/* Return the domain number for this pci bus */
-
-int pci_domain_nr(struct pci_bus *pbus)
-{
-   struct pci_pbm_info 

[PATCH 01/28] PCI: Rip out pci_bus_add_devices() from pci_scan_bus()

2015-01-15 Thread Yijing Wang
pci_bus_add_devices() should not be called from pci_scan_bus().
A PCI device is now added to the driver core as soon as it is
created, so all that is left in pci_bus_add_devices() is
driver attachment and some trivial sysfs work.
pci_scan_bus() should only be responsible for
scanning PCI devices, not for driver attachment.
In addition, some callers of pci_scan_bus() (m68k, unicore32, alpha)
call pci_bus_size_bridges() and pci_bus_assign_resources()
after pci_scan_bus().

E.g. in m68k:

mcf_pci_init()
    pci_scan_bus()
        ...
        pci_bus_add_devices() --- try to attach driver
    pci_fixup_irqs()
    pci_bus_size_bridges()
    pci_bus_assign_resources()

This is not correct: resources should be assigned
before drivers are attached.
So rip pci_bus_add_devices() out of pci_scan_bus() for a
cleaner design. After this patch is applied, pci_scan_bus()
should be used in a flow like:

pci_scan_bus() (mandatory)
pci_fixup_irqs() (optional)
pci_bus_size_bridges() (optional)
pci_bus_assign_resources() (optional)
pci_bus_add_devices() (mandatory)
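
The ordering problem can be illustrated with a small userspace mock.
This is only a sketch, not kernel code: struct mock_bus and the mock_*
helpers are hypothetical stand-ins for struct pci_bus and the kernel
functions named above. The mock records whether driver attachment
happened before resource assignment in the old and in the new flow.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_bus; tracks only the ordering. */
struct mock_bus {
	bool scanned;
	bool resources_assigned;
	bool drivers_attached;
	bool attached_before_assign;	/* set when the ordering bug occurs */
};

/* Stand-in for the device enumeration part of pci_scan_bus(). */
static void mock_scan_bus(struct mock_bus *bus)
{
	bus->scanned = true;
}

/* Stand-in for pci_bus_size_bridges() + pci_bus_assign_resources(). */
static void mock_assign_resources(struct mock_bus *bus)
{
	if (bus->drivers_attached)
		bus->attached_before_assign = true;
	bus->resources_assigned = true;
}

/* Stand-in for pci_bus_add_devices(): driver attachment. */
static void mock_add_devices(struct mock_bus *bus)
{
	assert(bus->scanned);	/* attaching drivers needs a scanned bus */
	bus->drivers_attached = true;
}

/* Old flow: the scan call attaches drivers itself, resources come later. */
void old_flow(struct mock_bus *bus)
{
	mock_scan_bus(bus);
	mock_add_devices(bus);	/* used to be hidden inside the scan */
	mock_assign_resources(bus);
}

/* New flow: the caller attaches drivers as the last step. */
void new_flow(struct mock_bus *bus)
{
	mock_scan_bus(bus);
	mock_assign_resources(bus);
	mock_add_devices(bus);
}
```

Running old_flow() on a zeroed mock_bus sets attached_before_assign,
while new_flow() leaves it clear.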

Signed-off-by: Yijing Wang 
CC: "David S. Miller" 
CC: Geert Uytterhoeven 
CC: Guan Xuetao 
CC: linux-al...@vger.kernel.org
CC: linux-m...@lists.linux-m68k.org
CC: sparcli...@vger.kernel.org
---
 arch/alpha/kernel/sys_nautilus.c  |1 +
 arch/m68k/coldfire/pci.c  |1 +
 arch/sparc/kernel/pcic.c  |2 ++
 arch/unicore32/kernel/pci.c   |   11 +++
 drivers/pci/hotplug/ibmphp_core.c |6 --
 drivers/pci/probe.c   |1 -
 6 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/alpha/kernel/sys_nautilus.c b/arch/alpha/kernel/sys_nautilus.c
index 837c0fa..4ae4a40 100644
--- a/arch/alpha/kernel/sys_nautilus.c
+++ b/arch/alpha/kernel/sys_nautilus.c
@@ -253,6 +253,7 @@ nautilus_init_pci(void)
   for the root bus, so just clear it. */
bus->self = NULL;
pci_fixup_irqs(alpha_mv.pci_swizzle, alpha_mv.pci_map_irq);
+   pci_bus_add_devices(bus);
 }
 
 /*
diff --git a/arch/m68k/coldfire/pci.c b/arch/m68k/coldfire/pci.c
index df96792..d45f087 100644
--- a/arch/m68k/coldfire/pci.c
+++ b/arch/m68k/coldfire/pci.c
@@ -319,6 +319,7 @@ static int __init mcf_pci_init(void)
pci_fixup_irqs(pci_common_swizzle, mcf_pci_map_irq);
pci_bus_size_bridges(rootbus);
pci_bus_assign_resources(rootbus);
+   pci_bus_add_devices(rootbus);
return 0;
 }
 
diff --git a/arch/sparc/kernel/pcic.c b/arch/sparc/kernel/pcic.c
index 6cc78c2..7a82fe2 100644
--- a/arch/sparc/kernel/pcic.c
+++ b/arch/sparc/kernel/pcic.c
@@ -391,6 +391,8 @@ static void __init pcic_pbm_scan_bus(struct linux_pcic *pcic)
struct linux_pbm_info *pbm = >pbm;
 
pbm->pci_bus = pci_scan_bus(pbm->pci_first_busno, _ops, pbm);
+   if (pbm->pci_bus)
+   pci_bus_add_devices(pbm->pci_bus);
 #if 0 /* deadwood transplanted from sparc64 */
pci_fill_in_pbm_cookies(pbm->pci_bus, pbm, pbm->prom_node);
pci_record_assignments(pbm, pbm->pci_bus);
diff --git a/arch/unicore32/kernel/pci.c b/arch/unicore32/kernel/pci.c
index 374a055..3d82024 100644
--- a/arch/unicore32/kernel/pci.c
+++ b/arch/unicore32/kernel/pci.c
@@ -266,17 +266,12 @@ static int __init pci_common_init(void)
pci_fixup_irqs(pci_common_swizzle, pci_puv3_map_irq);
 
if (!pci_has_flag(PCI_PROBE_ONLY)) {
-   /*
-* Size the bridge windows.
-*/
+   /* Size the bridge windows. */
pci_bus_size_bridges(puv3_bus);
-
-   /*
-* Assign resources.
-*/
+   /* Assign resources. */
pci_bus_assign_resources(puv3_bus);
}
-
+   pci_bus_add_devices(puv3_bus);
return 0;
 }
 subsys_initcall(pci_common_init);
diff --git a/drivers/pci/hotplug/ibmphp_core.c b/drivers/pci/hotplug/ibmphp_core.c
index 96c5c72..86e3bfd 100644
--- a/drivers/pci/hotplug/ibmphp_core.c
+++ b/drivers/pci/hotplug/ibmphp_core.c
@@ -738,7 +738,7 @@ static void ibm_unconfigure_device(struct pci_func *func)
  */
 static u8 bus_structure_fixup(u8 busno)
 {
-   struct pci_bus *bus;
+   struct pci_bus *bus, *b;
struct pci_dev *dev;
u16 l;
 
@@ -765,7 +765,9 @@ static u8 bus_structure_fixup(u8 busno)
(l != 0x) && (l != 0x)) {
debug("%s - Inside bus_structure_fixup()\n",
__func__);
-   pci_scan_bus(busno, ibmphp_pci_bus->ops, NULL);
+   b = pci_scan_bus(busno, ibmphp_pci_bus->ops, NULL);
+   if (b)
+   pci_bus_add_devices(b);
break;
}
}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 23212f8..053c0f4 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2123,7 +2123,6 

[PATCH 04/28] PCI: Remove deprecated pci_scan_bus_parented()

2015-01-15 Thread Yijing Wang
Nothing uses pci_scan_bus_parented() any more,
so remove it.

Signed-off-by: Yijing Wang 
---
 drivers/pci/probe.c |   19 ---
 include/linux/pci.h |2 --
 2 files changed, 0 insertions(+), 21 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 7cf577f..e44de73 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2091,25 +2091,6 @@ struct pci_bus *pci_scan_root_bus(struct device *parent, int bus,
 }
 EXPORT_SYMBOL(pci_scan_root_bus);
 
-/* Deprecated; use pci_scan_root_bus() instead */
-struct pci_bus *pci_scan_bus_parented(struct device *parent,
-   int bus, struct pci_ops *ops, void *sysdata)
-{
-   LIST_HEAD(resources);
-   struct pci_bus *b;
-
-   pci_add_resource(, _resource);
-   pci_add_resource(, _resource);
-   pci_add_resource(, _resource);
-   b = pci_create_root_bus(parent, bus, ops, sysdata, );
-   if (b)
-   pci_scan_child_bus(b);
-   else
-   pci_free_resource_list();
-   return b;
-}
-EXPORT_SYMBOL(pci_scan_bus_parented);
-
 struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops,
void *sysdata)
 {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 360a966..55b2c81 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -769,8 +769,6 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res,
 void pcibios_scan_specific_bus(int busn);
 struct pci_bus *pci_find_bus(int domain, int busnr);
 void pci_bus_add_devices(const struct pci_bus *bus);
-struct pci_bus *pci_scan_bus_parented(struct device *parent, int bus,
- struct pci_ops *ops, void *sysdata);
 struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops, void *sysdata);
 struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
struct pci_ops *ops, void *sysdata,
-- 
1.7.1



[PATCH 06/28] PCI: Combine PCI domain and bus number in u32 arg

2015-01-15 Thread Yijing Wang
Currently, we use the int type for the bus number in
pci_create_root_bus(), pci_scan_root_bus() and
pci_scan_bus_legacy(). Because a PCI bus number is
always <= 255, we can change the bus number
argument type to u32 and combine the PCI domain and
bus number in one value. Also add a domain member to
pci_host_bridge to save the domain number. Finally,
this lets us eliminate most of the platform-specific
pci_domain_nr() implementations at the end of the series.
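
As a rough illustration of the packing, here is a userspace sketch.
The macro names PCI_DOMBUS(), PCI_DOMAIN() and PCI_BUSNUM() are the
ones used in this series, but the bit layout below (bus number in
bits 7..0, 16-bit domain above it) is an assumption made for the
example; the real definitions are in include/uapi/linux/pci.h of
this series and may differ.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed layout, for illustration only:
 *   bits 23..8  domain number (fits the u16 domain in pci_host_bridge)
 *   bits  7..0  bus number (always <= 255)
 */
#define PCI_DOMBUS(domain, bus) \
	((((uint32_t)(domain) & 0xffffu) << 8) | ((uint32_t)(bus) & 0xffu))
#define PCI_DOMAIN(db)	((uint16_t)(((uint32_t)(db) >> 8) & 0xffffu))
#define PCI_BUSNUM(db)	((uint8_t)((uint32_t)(db) & 0xffu))

/* Compile-time check that the largest bus number survives packing. */
static_assert(PCI_BUSNUM(PCI_DOMBUS(0x1234, 0xff)) == 0xff,
	      "bus number must round-trip through PCI_DOMBUS()");
```

With this layout a caller that passes a plain bus number, as the
legacy callers do, ends up in domain 0 without any change.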

Signed-off-by: Yijing Wang 
---
 drivers/pci/probe.c  |   16 +---
 include/linux/pci.h  |7 ---
 include/uapi/linux/pci.h |3 +++
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ed894cb..50f58b3 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1889,7 +1889,7 @@ void __weak pcibios_remove_bus(struct pci_bus *bus)
 {
 }
 
-struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
+struct pci_bus *pci_create_root_bus(struct device *parent, u32 db,
struct pci_ops *ops, void *sysdata, struct list_head *resources)
 {
int error;
@@ -1900,6 +1900,7 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
resource_size_t offset;
char bus_addr[64];
char *fmt;
+   u8  bus = PCI_BUSNUM(db);
 
b = pci_alloc_bus(NULL);
if (!b)
@@ -1920,6 +1921,7 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
if (!bridge)
goto err_out;
 
+   bridge->domain = PCI_DOMAIN(db);
bridge->dev.parent = parent;
bridge->dev.release = pci_release_host_bridge_dev;
dev_set_name(>dev, "pci%04x:%02x", pci_domain_nr(b), bus);
@@ -2057,7 +2059,7 @@ void pci_bus_release_busn_res(struct pci_bus *b)
res, ret ? "can not be" : "is");
 }
 
-struct pci_bus *pci_scan_root_bus(struct device *parent, int bus,
+struct pci_bus *pci_scan_root_bus(struct device *parent, u32 db,
struct pci_ops *ops, void *sysdata, struct list_head *resources)
 {
struct pci_host_bridge_window *window;
@@ -2071,15 +2073,15 @@ struct pci_bus *pci_scan_root_bus(struct device *parent, int bus,
break;
}
 
-   b = pci_create_root_bus(parent, bus, ops, sysdata, resources);
+   b = pci_create_root_bus(parent, db, ops, sysdata, resources);
if (!b)
return NULL;
 
if (!found) {
dev_info(>dev,
 "No busn resource found for root bus, will use [bus %02x-ff]\n",
-   bus);
-   pci_bus_insert_busn_res(b, bus, 255);
+   PCI_BUSNUM(db));
+   pci_bus_insert_busn_res(b, PCI_BUSNUM(db), 255);
}
 
max = pci_scan_child_bus(b);
@@ -2091,7 +2093,7 @@ struct pci_bus *pci_scan_root_bus(struct device *parent, int bus,
 }
 EXPORT_SYMBOL(pci_scan_root_bus);
 
-struct pci_bus *pci_scan_bus_legacy(int bus, struct pci_ops *ops,
+struct pci_bus *pci_scan_bus_legacy(u32 db, struct pci_ops *ops,
void *sysdata)
 {
LIST_HEAD(resources);
@@ -2100,7 +2102,7 @@ struct pci_bus *pci_scan_bus_legacy(int bus, struct pci_ops *ops,
pci_add_resource(, _resource);
pci_add_resource(, _resource);
pci_add_resource(, _resource);
-   b = pci_create_root_bus(NULL, bus, ops, sysdata, );
+   b = pci_create_root_bus(NULL, db, ops, sysdata, );
if (b) {
pci_scan_child_bus(b);
} else {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index a6fa2f1..c771508 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -402,6 +402,7 @@ struct pci_host_bridge_window {
 };
 
 struct pci_host_bridge {
+   u16 domain;
struct device dev;
struct pci_bus *bus;/* root bus */
struct list_head windows;   /* pci_host_bridge_windows */
@@ -769,14 +770,14 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res,
 void pcibios_scan_specific_bus(int busn);
 struct pci_bus *pci_find_bus(int domain, int busnr);
 void pci_bus_add_devices(const struct pci_bus *bus);
-struct pci_bus *pci_scan_bus_legacy(int bus, struct pci_ops *ops, void *sysdata);
-struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
+struct pci_bus *pci_scan_bus_legacy(u32 dombus, struct pci_ops *ops, void *sysdata);
+struct pci_bus *pci_create_root_bus(struct device *parent, u32 dombus,
struct pci_ops *ops, void *sysdata,
struct list_head *resources);
 int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int busmax);
 int pci_bus_update_busn_res_end(struct pci_bus *b, int busmax);
 void pci_bus_release_busn_res(struct pci_bus *b);
-struct pci_bus *pci_scan_root_bus(struct device *parent, int bus,
+struct pci_bus *pci_scan_root_bus(struct device *parent, u32 dombus,
 

Re: [PATCH v2 1/2] mm/slub: optimize alloc/free fastpath by removing preemption on/off

2015-01-15 Thread Steven Rostedt
On Thu, 15 Jan 2015 21:27:14 -0600 (CST)
Christoph Lameter  wrote:

> 
> The %gs register is not used since the address of the per cpu area is
> available as one of the first fields in the per cpu areas.

Have you disassembled your code?

Looking at put_cpu_partial() from 3.19-rc3 where it does:

oldpage = this_cpu_read(s->cpu_slab->partial);

I get:

mov%gs:0x18(%rax),%rdx

Looks to me that %gs is used.


I haven't done benchmarks in a while, so perhaps accessing the %gs
segment isn't as expensive as I saw it before. I'll have to profile
function tracing on my i7 and see where things are slow again.


-- Steve


[PATCH 00/28] Refine PCI scan interfaces and make generic pci host bridge

2015-01-15 Thread Yijing Wang
This series is based on Bjorn's pci-next branch.

Patches 1-4 rip pci_bus_add_devices() out of the PCI scan interfaces
for a better PCI scan flow.

Patches 5-11 make a generic pci_host_bridge to hold host bridge
related information, and introduce pci_host_bridge_ops so that
platform host drivers can provide their own host bridge
operations to set up the pci_host_bridge during PCI enumeration.

Patches 12-28 apply the new PCI scan interfaces to the platform PCI
host bridge drivers.

Currently the kernel scans PCI buses in the following ways:
1. pci_scan_bus()
   parent = NULL, default io/mem/bus resources
   call pci_bus_add_devices()

2. pci_scan_bus_parented() + pci_bus_add_devices()
   default io/mem/bus resources, only used by xen

3. pci_scan_root_bus() + pci_bus_add_devices()

4. pci_create_root_bus() + pci_scan_child_bus() + pci_bus_add_devices()

5. pci_create_root_bus() + xx_of_scan_bus()  +  pci_bus_add_devices()

We also have many arch-specific pci_domain_nr() implementations and
other platform-specific weak functions such as pcibios_root_bridge_prepare().

After this series is applied, we have the following scan interfaces:

1. pci_scan_bus_legacy() 
   parent = NULL, default io/mem/bus resources.
   for legacy pci scan

2. pci_scan_root_bus()
   for callers that provide their own parent and io/mem/bus resources
   but no platform-specific pci_host_bridge operations

3. pci_scan_root_bridge()
   for callers that provide their own parent and io/mem/bus resources
   and pci_host_bridge_ops.

In addition, all of the above PCI scan interfaces require an additional
call to pci_bus_add_devices() to set match_driver to true and try to
attach drivers.

With this series applied, we can also eliminate all arch-specific
pci_domain_nr() implementations.

I tested this series on x86 (with or without ACPI).
Comments and tests are warmly welcome!

Arnd Bergmann (1):
  xen/PCI: Don't use deprecated function pci_scan_bus_parented()

Yijing Wang (27):
  PCI: Rip out pci_bus_add_devices() from pci_scan_bus()
  PCI: Rip out pci_bus_add_devices() from pci_scan_root_bus()
  PCI: Remove deprecated pci_scan_bus_parented()
  PCI: Rename pci_scan_bus() to pci_scan_bus_legacy()
  PCI: Combine PCI domain and bus number in u32 arg
  PCI: Pass PCI domain number combined with root bus number
  PCI: Introduce pci_host_assign_domain_nr() to assign domain
  PCI: Separate pci_host_bridge creation out of pci_create_root_bus()
  PCI: Save sysdata in pci_host_bridge drvdata
  PCI: Introduce pci_host_bridge_ops to setup host bridge
  PCI: Introduce new scan function pci_scan_root_bridge()
  PCI/x86: Refine pci_acpi_scan_root() with generic pci_host_bridge
  PCI/IA64: Refine pci_acpi_scan_root() with generic pci_host_bridge
  PCI/powerpc: Rename pcibios_root_bridge_prepare() for better
readability
  PCI/powerpc: Use pci_scan_root_bridge() for simplicity
  PCI: Remove weak pcibios_root_bridge_prepare()
  PCI/sparc: Use pci_scan_root_bridge() for simplicity
  PCI: Introduce pci_bus_child_max_busnr()
  PCI/Parisc: Use pci_scan_root_bus() for simplicity
  PCI/mvebu: Use pci_common_init_dev() to simplify code
  PCI/tegra: Remove redundant tegra_pcie_scan_bus()
  PCI/designware: Use pci_scan_root_bus() for simplicity
  PCI/xgene: Use pci_scan_root_bus() instead of pci_create_root_bus()
  PCI: Rename __pci_create_root_bus() to pci_create_root_bus()
  PCI: Export find_pci_host_bridge()
  PCI: Remove platform specific pci_domain_nr()
  PCI: Remove pci_bus_assign_domain_nr()

 arch/alpha/include/asm/pci.h |2 -
 arch/alpha/kernel/pci.c  |7 +-
 arch/alpha/kernel/sys_nautilus.c |4 +-
 arch/frv/mb93090-mb00/pci-vdk.c  |6 +-
 arch/ia64/include/asm/pci.h  |1 -
 arch/ia64/pci/pci.c  |   34 +++---
 arch/ia64/sn/kernel/io_init.c|6 +-
 arch/m68k/coldfire/pci.c |3 +-
 arch/microblaze/pci/pci-common.c |   17 +--
 arch/mips/include/asm/pci.h  |2 -
 arch/mips/pci/pci.c  |5 +-
 arch/mn10300/unit-asb2305/pci.c  |5 +-
 arch/powerpc/include/asm/machdep.h   |2 +-
 arch/powerpc/kernel/pci-common.c |   73 ++-
 arch/powerpc/platforms/pseries/pci.c |2 +-
 arch/powerpc/platforms/pseries/pseries.h |2 +-
 arch/powerpc/platforms/pseries/setup.c   |2 +-
 arch/s390/pci/pci.c  |   13 +--
 arch/sh/drivers/pci/pci.c|6 +-
 arch/sh/include/asm/pci.h|2 -
 arch/sparc/kernel/leon_pci.c |1 +
 arch/sparc/kernel/pci.c  |   48 
 arch/sparc/kernel/pcic.c |4 +-
 arch/tile/include/asm/pci.h  |2 -
 arch/tile/kernel/pci.c   |6 +-
 arch/tile/kernel/pci_gx.c|7 +-
 arch/unicore32/kernel/pci.c  |   13 +--
 arch/x86/include/asm/pci.h   |6 -
 arch/x86/pci/acpi.c  |   38 +++---
 arch/x86/pci/common.c   

[PATCH 18/28] PCI/sparc: Use pci_scan_root_bridge() for simplicity

2015-01-15 Thread Yijing Wang
Now we can use pci_scan_root_bridge() to scan
PCI buses by providing sparc-specific pci_host_bridge_ops.
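
The general idea of a per-platform ops table can be sketched in
userspace C. Everything below is hypothetical: the mock_* types and
the dispatch logic in mock_scan_root_bridge() are assumptions made for
illustration and are not the pci_host_bridge_ops interface from this
series. The sketch only shows a core scan helper letting a platform
pick an OF-based scan instead of the generic one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_host_bridge;

/* Hypothetical per-platform hooks, loosely modelled on an ops table. */
struct mock_host_bridge_ops {
	void (*probe_mode)(struct mock_host_bridge *host);
	void (*of_scan_bus)(struct mock_host_bridge *host);
};

/* Hypothetical host bridge; the counters exist only for the example. */
struct mock_host_bridge {
	bool of_scan;		/* platform asked for an OF-based scan */
	int of_scans;
	int generic_scans;
};

/* Stand-in for the generic child bus scan. */
static void mock_generic_scan(struct mock_host_bridge *host)
{
	host->generic_scans++;
}

/* Assumed core behaviour: ask the platform first, then scan once. */
void mock_scan_root_bridge(struct mock_host_bridge *host,
			   const struct mock_host_bridge_ops *ops)
{
	assert(host != NULL);

	if (ops && ops->probe_mode)
		ops->probe_mode(host);

	if (host->of_scan && ops && ops->of_scan_bus)
		ops->of_scan_bus(host);
	else
		mock_generic_scan(host);
}

/* A sparc-like platform that always wants the OF-based scan. */
static void sparc_like_probe_mode(struct mock_host_bridge *host)
{
	host->of_scan = true;
}

static void sparc_like_of_scan_bus(struct mock_host_bridge *host)
{
	host->of_scans++;
}

const struct mock_host_bridge_ops sparc_like_ops = {
	.probe_mode  = sparc_like_probe_mode,
	.of_scan_bus = sparc_like_of_scan_bus,
};
```

A bridge scanned with sparc_like_ops goes through the OF path exactly
once, while a bridge scanned with a NULL ops pointer falls back to the
generic scan.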

Signed-off-by: Yijing Wang 
---
 arch/sparc/kernel/pci.c |   30 --
 1 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c
index d798b42..42dc21f 100644
--- a/arch/sparc/kernel/pci.c
+++ b/arch/sparc/kernel/pci.c
@@ -647,12 +647,31 @@ static void pci_claim_bus_resources(struct pci_bus *bus)
pci_claim_bus_resources(child_bus);
 }
 
+static void pci_host_bridge_probe_mode(
+   struct pci_host_bridge *host)
+{
+   host->of_scan = true;
+}
+
+static void pci_host_bridge_of_scan_bus(
+   struct pci_host_bridge *host)
+{
+   struct pci_pbm_info *pbm = dev_get_drvdata(>dev);
+   struct device_node *node = pbm->op->dev.of_node;
+
+   pci_of_scan_bus(pbm, node, host->bus);
+}
+
+static struct pci_host_bridge_ops phb_ops = {
+   .phb_of_scan_bus,
+};
+
 struct pci_bus *pci_scan_one_pbm(struct pci_pbm_info *pbm,
 struct device *parent)
 {
LIST_HEAD(resources);
struct device_node *node = pbm->op->dev.of_node;
-   struct pci_bus *bus;
+   struct pci_host_bridge *host;
 
printk("PCI: Scanning PBM %s\n", node->full_name);
 
@@ -664,17 +683,16 @@ struct pci_bus *pci_scan_one_pbm(struct pci_pbm_info *pbm,
pbm->busn.end   = pbm->pci_last_busno;
pbm->busn.flags = IORESOURCE_BUS;
pci_add_resource(, >busn);
-   bus = pci_create_root_bus(parent, 
+   host = pci_scan_root_bridge(parent,
PCI_DOMBUS(pbm->index, pbm->pci_first_busno), 
-   pbm->pci_ops, pbm, );
-   if (!bus) {
-   printk(KERN_ERR "Failed to create bus for %s\n",
+   pbm->pci_ops, pbm, , _ops);
+   if (!host) {
+   printk(KERN_ERR "Failed to create host bridge for %s\n",
   node->full_name);
pci_free_resource_list();
return NULL;
}
 
-   pci_of_scan_bus(pbm, node, bus);
pci_bus_add_devices(bus);
pci_bus_register_of_sysfs(bus);
 
-- 
1.7.1



[PATCH 02/28] PCI: Rip out pci_bus_add_devices() from pci_scan_root_bus()

2015-01-15 Thread Yijing Wang
Just like for pci_scan_bus(), we should also rip
pci_bus_add_devices() out of pci_scan_root_bus().
Many platforms first call pci_scan_root_bus() and
then call pci_bus_size_bridges() and
pci_bus_assign_resources(). Placing pci_bus_add_devices()
in pci_scan_root_bus() breaks that PCI scan ordering.
The arm hw_pci->scan() functions which call
pci_scan_root_bus() need no change,
because pci_bus_add_devices() is called later
in pci_common_init_dev().

Signed-off-by: Yijing Wang 
CC: David Howells 
CC: Tony Luck 
CC: Michal Simek 
CC: Ralf Baechle 
CC: Koichi Yasutake 
CC: Sebastian Ott 
CC: Thomas Gleixner 
CC: Chris Metcalf 
CC: Chris Zankel 
CC: linux-al...@vger.kernel.org
CC: linux-kernel@vger.kernel.org
CC: linux-m...@linux-mips.org
CC: linux-am33-l...@redhat.com
CC: linux-s...@vger.kernel.org
CC: linux...@vger.kernel.org
CC: sparcli...@vger.kernel.org
CC: linux-...@vger.kernel.org
CC: linux-xte...@linux-xtensa.org
---
 arch/alpha/kernel/pci.c  |2 ++
 arch/frv/mb93090-mb00/pci-vdk.c  |6 --
 arch/ia64/sn/kernel/io_init.c|1 +
 arch/microblaze/pci/pci-common.c |1 +
 arch/mips/pci/pci.c  |1 +
 arch/mn10300/unit-asb2305/pci.c  |5 -
 arch/s390/pci/pci.c  |2 +-
 arch/sh/drivers/pci/pci.c|1 +
 arch/sparc/kernel/leon_pci.c |1 +
 arch/tile/kernel/pci.c   |2 ++
 arch/tile/kernel/pci_gx.c|2 ++
 arch/x86/pci/common.c|1 +
 arch/xtensa/kernel/pci.c |2 ++
 drivers/pci/probe.c  |1 -
 14 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/arch/alpha/kernel/pci.c b/arch/alpha/kernel/pci.c
index 076c35c..97f9730 100644
--- a/arch/alpha/kernel/pci.c
+++ b/arch/alpha/kernel/pci.c
@@ -334,6 +334,8 @@ common_init_pci(void)
 
bus = pci_scan_root_bus(NULL, next_busno, alpha_mv.pci_ops,
hose, );
+   if (bus)
+   pci_bus_add_devices(bus);
hose->bus = bus;
hose->need_domain_info = need_domain_info;
next_busno = bus->busn_res.end + 1;
diff --git a/arch/frv/mb93090-mb00/pci-vdk.c b/arch/frv/mb93090-mb00/pci-vdk.c
index efa5d65..2b36044 100644
--- a/arch/frv/mb93090-mb00/pci-vdk.c
+++ b/arch/frv/mb93090-mb00/pci-vdk.c
@@ -316,6 +316,7 @@ void pcibios_fixup_bus(struct pci_bus *bus)
 
 int __init pcibios_init(void)
 {
+   struct pci_bus *bus;
struct pci_ops *dir = NULL;
LIST_HEAD(resources);
 
@@ -383,12 +384,13 @@ int __init pcibios_init(void)
printk("PCI: Probing PCI hardware\n");
pci_add_resource(, _ioport_resource);
pci_add_resource(, _iomem_resource);
-   pci_scan_root_bus(NULL, 0, pci_root_ops, NULL, );
+   bus = pci_scan_root_bus(NULL, 0, pci_root_ops, NULL, );
 
pcibios_irq_init();
pcibios_fixup_irqs();
pcibios_resource_survey();
-
+   if (bus)
+   pci_bus_add_devices(bus);
return 0;
 }
 
diff --git a/arch/ia64/sn/kernel/io_init.c b/arch/ia64/sn/kernel/io_init.c
index 0b5ce82..63b43a6 100644
--- a/arch/ia64/sn/kernel/io_init.c
+++ b/arch/ia64/sn/kernel/io_init.c
@@ -272,6 +272,7 @@ sn_pci_controller_fixup(int segment, int busnum, struct pci_bus *bus)
kfree(res);
kfree(controller);
}
+   pci_bus_add_devices(bus);
 }
 
 /*
diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index b30e41c..009b271 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -1351,6 +1351,7 @@ static void pcibios_scan_phb(struct pci_controller *hose)
hose->bus = bus;
 
hose->last_busno = bus->busn_res.end;
+   pci_bus_add_devices(bus);
 }
 
 static int __init pcibios_init(void)
diff --git a/arch/mips/pci/pci.c b/arch/mips/pci/pci.c
index 1bf60b1..9eb54b5 100644
--- a/arch/mips/pci/pci.c
+++ b/arch/mips/pci/pci.c
@@ -114,6 +114,7 @@ static void pcibios_scanbus(struct pci_controller *hose)
pci_bus_size_bridges(bus);
pci_bus_assign_resources(bus);
}
+   pci_bus_add_devices(bus);
}
 }
 
diff --git a/arch/mn10300/unit-asb2305/pci.c b/arch/mn10300/unit-asb2305/pci.c
index 6b4339f..860aa35 100644
--- a/arch/mn10300/unit-asb2305/pci.c
+++ b/arch/mn10300/unit-asb2305/pci.c
@@ -345,6 +345,7 @@ void pcibios_fixup_bus(struct pci_bus *bus)
  */
 static int __init pcibios_init(void)
 {
+   struct pci_bus *bus;
resource_size_t io_offset, mem_offset;
LIST_HEAD(resources);
 
@@ -376,11 +377,13 @@ static int __init pcibios_init(void)
 
pci_add_resource_offset(, _ioport_resource, io_offset);
pci_add_resource_offset(, _iomem_resource, mem_offset);
-   pci_scan_root_bus(NULL, 0, _direct_ampci, NULL, );
+   bus = pci_scan_root_bus(NULL, 0, _direct_ampci, NULL, );
 
pcibios_irq_init();
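
The call order that this patch leaves to each architecture can be
sketched in user space. This is a minimal model, not kernel code: the
mock_* functions and the trace buffer are hypothetical stand-ins for
pci_scan_root_bus(), pci_bus_size_bridges(), pci_bus_assign_resources()
and pci_bus_add_devices(). It only illustrates that the add step runs
last, after sizing and assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Records the order in which the mock steps run. */
static char trace[64];

static void step(const char *name)
{
	if (trace[0] != '\0')
		strncat(trace, ",", sizeof(trace) - strlen(trace) - 1);
	strncat(trace, name, sizeof(trace) - strlen(trace) - 1);
}

/* Hypothetical stand-in for struct pci_bus. */
struct mock_bus {
	int scanned;
	int added;
};

/* Stand-in for pci_scan_root_bus(): scans, but no longer adds devices. */
static struct mock_bus *mock_scan_root_bus(struct mock_bus *bus)
{
	step("scan");
	bus->scanned = 1;
	return bus;
}

static void mock_size_bridges(struct mock_bus *bus)
{
	(void)bus;
	step("size");
}

static void mock_assign_resources(struct mock_bus *bus)
{
	(void)bus;
	step("assign");
}

/* Stand-in for pci_bus_add_devices(), now called by the arch code. */
static void mock_add_devices(struct mock_bus *bus)
{
	step("add");
	bus->added = 1;
}

/* Arch-side init: add devices only after resources are assigned. */
static const char *mock_arch_init(struct mock_bus *storage)
{
	struct mock_bus *bus;

	trace[0] = '\0';
	bus = mock_scan_root_bus(storage);
	if (!bus)
		return trace;
	mock_size_bridges(bus);
	mock_assign_resources(bus);
	mock_add_devices(bus);
	return trace;
}
```

Calling mock_arch_init() on a zeroed mock_bus leaves the step names in
trace in the order they ran.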

[PATCH 08/28] PCI: Introduce pci_host_assign_domain_nr() to assign domain

2015-01-15 Thread Yijing Wang
Introduce pci_host_assign_domain_nr() to assign the domain
number of a pci_host_bridge. pci_bus_assign_domain_nr()
will be removed in a later patch.

Signed-off-by: Yijing Wang 
---
 drivers/pci/pci.c   |   47 +++
 include/linux/pci.h |4 
 2 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index c419554..8b35e8e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4443,6 +4443,53 @@ int pci_get_new_domain_nr(void)
 }
 
 #ifdef CONFIG_PCI_DOMAINS_GENERIC
+void pci_host_assign_domain_nr(struct pci_host_bridge *host)
+{
+   static int use_dt_domains = -1;
+   struct device *parent = host->dev.parent;
+   int domain = of_get_pci_domain_nr(parent->of_node);
+
+   /*
+* Check DT domain and use_dt_domains values.
+*
+* If DT domain property is valid (domain >= 0) and
+* use_dt_domains != 0, the DT assignment is valid since this means
+* we have not previously allocated a domain number by using
+* pci_get_new_domain_nr(); we should also update use_dt_domains to
+* 1, to indicate that we have just assigned a domain number from
+* DT.
+*
+* If DT domain property value is not valid (ie domain < 0), and we
+* have not previously assigned a domain number from DT
+* (use_dt_domains != 1) we should assign a domain number by
+* using the:
+*
+* pci_get_new_domain_nr()
+*
+* API and update the use_dt_domains value to keep track of method we
+* are using to assign domain numbers (use_dt_domains = 0).
+*
+* All other combinations imply we have a platform that is trying
+* to mix domain numbers obtained from DT and pci_get_new_domain_nr(),
+* which is a recipe for domain mishandling and it is prevented by
+* invalidating the domain value (domain = -1) and printing a
+* corresponding error.
+*/
+   if (domain >= 0 && use_dt_domains) {
+   use_dt_domains = 1;
+   } else if (domain < 0 && use_dt_domains != 1) {
+   use_dt_domains = 0;
+   domain = pci_get_new_domain_nr();
+   } else {
+   dev_err(parent, "Node %s has inconsistent \"linux,pci-domain\" property in DT\n",
+   parent->of_node->full_name);
+   domain = -1;
+   }
+
+   host->domain_nr = domain;
+}
+
+
 void pci_bus_assign_domain_nr(struct pci_bus *bus, struct device *parent)
 {
static int use_dt_domains = -1;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c771508..1b9c799 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1316,11 +1316,15 @@ static inline int pci_domain_nr(struct pci_bus *bus)
return bus->domain_nr;
 }
 void pci_bus_assign_domain_nr(struct pci_bus *bus, struct device *parent);
+void pci_host_assign_domain_nr(struct pci_host_bridge *host);
 #else
 static inline void pci_bus_assign_domain_nr(struct pci_bus *bus,
struct device *parent)
 {
 }
+static inline void pci_host_assign_domain_nr(struct pci_host_bridge *host)
+{
+}
 #endif
 
 /* some architectures require additional setup to direct VGA traffic */
-- 
1.7.1
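
The decision table described in the comment of
pci_host_assign_domain_nr() can be modelled in user space. A minimal
sketch, with assumptions: struct domain_state, model_get_new_domain_nr()
and model_assign_domain() are hypothetical names, the static
use_dt_domains variable is moved into a struct so that separate
scenarios can be tried, and the dynamic allocator is a plain counter
starting at 0 rather than the kernel's pci_get_new_domain_nr().

```c
#include <assert.h>

/* Hypothetical container for what the patch keeps in a static variable. */
struct domain_state {
	int use_dt_domains;	/* -1: undecided, 0: dynamic, 1: from DT */
	int next_dynamic;	/* stand-in for the dynamic allocator state */
};

/* Stand-in for pci_get_new_domain_nr(); here just a counter. */
static int model_get_new_domain_nr(struct domain_state *s)
{
	return s->next_dynamic++;
}

/*
 * dt_domain is the value read from the device tree, or negative if the
 * property is absent. Returns the domain to use, or -1 when DT and
 * dynamically allocated domain numbers would be mixed.
 */
static int model_assign_domain(struct domain_state *s, int dt_domain)
{
	if (dt_domain >= 0 && s->use_dt_domains != 0) {
		/* Valid DT value and no dynamic allocation so far. */
		s->use_dt_domains = 1;
		return dt_domain;
	}
	if (dt_domain < 0 && s->use_dt_domains != 1) {
		/* No DT value and no DT assignment so far. */
		s->use_dt_domains = 0;
		return model_get_new_domain_nr(s);
	}
	/* Mixed use: invalidate, as the patch does with domain = -1. */
	return -1;
}
```

Once the first bridge has fixed the method (DT or dynamic), every later
bridge that uses the other method gets -1 in this model.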



[PATCH 07/28] PCI: Pass PCI domain number combined with root bus number

2015-01-15 Thread Yijing Wang
Now the PCI domain number can be passed together with the
root bus number in a single u32 argument. On arm/arm64 the
PCI domain number is assigned by pci_bus_assign_domain_nr(),
so the pci_scan_root_bus() and pci_create_root_bus() callers
in arm/arm64 are left unchanged. A later patch introduces
pci_host_assign_domain_nr() to assign the domain number on
arm/arm64.

Signed-off-by: Yijing Wang 
---
 arch/alpha/kernel/pci.c  |5 +++--
 arch/alpha/kernel/sys_nautilus.c |3 ++-
 arch/ia64/pci/pci.c  |4 ++--
 arch/ia64/sn/kernel/io_init.c|5 +++--
 arch/microblaze/pci/pci-common.c |5 +++--
 arch/mips/pci/pci.c  |4 ++--
 arch/powerpc/kernel/pci-common.c |5 +++--
 arch/s390/pci/pci.c  |5 +++--
 arch/sh/drivers/pci/pci.c|5 +++--
 arch/sparc/kernel/pci.c  |5 +++--
 arch/tile/kernel/pci.c   |4 ++--
 arch/tile/kernel/pci_gx.c|5 +++--
 arch/x86/pci/acpi.c  |6 +++---
 arch/x86/pci/common.c|3 ++-
 drivers/pci/xen-pcifront.c   |5 +++--
 15 files changed, 40 insertions(+), 29 deletions(-)

diff --git a/arch/alpha/kernel/pci.c b/arch/alpha/kernel/pci.c
index 97f9730..b15f9f2 100644
--- a/arch/alpha/kernel/pci.c
+++ b/arch/alpha/kernel/pci.c
@@ -332,8 +332,9 @@ common_init_pci(void)
pci_add_resource_offset(, hose->mem_space,
hose->mem_space->start);
 
-   bus = pci_scan_root_bus(NULL, next_busno, alpha_mv.pci_ops,
-   hose, );
+   bus = pci_scan_root_bus(NULL,
+   PCI_DOMBUS(hose->index, next_busno), alpha_mv.pci_ops,
+   hose, );
if (bus)
pci_bus_add_devices(bus);
hose->bus = bus;
diff --git a/arch/alpha/kernel/sys_nautilus.c b/arch/alpha/kernel/sys_nautilus.c
index 2c864bb..f7bfdf3 100644
--- a/arch/alpha/kernel/sys_nautilus.c
+++ b/arch/alpha/kernel/sys_nautilus.c
@@ -206,7 +206,8 @@ nautilus_init_pci(void)
unsigned long memtop = max_low_pfn << PAGE_SHIFT;
 
/* Scan our single hose.  */
-   bus = pci_scan_bus_legacy(0, alpha_mv.pci_ops, hose);
+   bus = pci_scan_bus_legacy(PCI_DOMBUS(hose->index, 0),
+   alpha_mv.pci_ops, hose);
hose->bus = bus;
pcibios_claim_one_bus(bus);
 
diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
index 291a582..e457015 100644
--- a/arch/ia64/pci/pci.c
+++ b/arch/ia64/pci/pci.c
@@ -465,8 +465,8 @@ struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root)
 * should handle the case here, but it appears that IA64 hasn't
 * such quirk. So we just ignore the case now.
 */
-   pbus = pci_create_root_bus(NULL, bus, _root_ops, controller,
-  >resources);
+   pbus = pci_create_root_bus(NULL, PCI_DOMBUS(domain, bus),
+   _root_ops, controller, >resources);
if (!pbus) {
pci_free_resource_list(>resources);
__release_pci_root_info(info);
diff --git a/arch/ia64/sn/kernel/io_init.c b/arch/ia64/sn/kernel/io_init.c
index 63b43a6..bcdc5b8 100644
--- a/arch/ia64/sn/kernel/io_init.c
+++ b/arch/ia64/sn/kernel/io_init.c
@@ -266,8 +266,9 @@ sn_pci_controller_fixup(int segment, int busnum, struct pci_bus *bus)
pci_add_resource_offset(, [1],
prom_bussoft_ptr->bs_legacy_mem);
 
-   bus = pci_scan_root_bus(NULL, busnum, _root_ops, controller,
-   );
+   bus = pci_scan_root_bus(NULL,
+   PCI_DOMBUS(controller->segment, busnum),
+   _root_ops, controller, );
if (bus == NULL) {
kfree(res);
kfree(controller);
diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 009b271..890bd36 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -1339,8 +1339,9 @@ static void pcibios_scan_phb(struct pci_controller *hose)
 
pcibios_setup_phb_resources(hose, );
 
-   bus = pci_scan_root_bus(hose->parent, hose->first_busno,
-   hose->ops, hose, );
+   bus = pci_scan_root_bus(hose->parent,
+   PCI_DOMBUS(hose->global_number, hose->first_busno),
+   hose->ops, hose, );
if (bus == NULL) {
pr_err("Failed to create bus for PCI domain %04x\n",
   hose->global_number);
diff --git a/arch/mips/pci/pci.c b/arch/mips/pci/pci.c
index 9eb54b5..980755a 100644
--- a/arch/mips/pci/pci.c
+++ b/arch/mips/pci/pci.c
@@ -92,8 +92,8 @@ static void pcibios_scanbus(struct pci_controller *hose)
pci_add_resource_offset(,
hose->mem_resource, hose->mem_offset);
pci_add_resource_offset(, hose->io_resource, 
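
The idea of carrying a domain and a bus number in one u32 can be
sketched as follows. The real PCI_DOMBUS(), PCI_DOMAIN() and
PCI_BUSNUM() macros are defined in an earlier patch of this series that
is not shown here, so the bit layout below (domain in the upper 16 bits,
bus number in the low 8 bits) is an assumption, and the model_* helpers
are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * ASSUMED layout, for illustration only:
 *   bits 31..16  domain number
 *   bits  7..0   bus number
 */
static inline uint32_t model_dombus(uint16_t domain, uint8_t bus)
{
	return ((uint32_t)domain << 16) | bus;
}

/* Extract the domain part of a combined value. */
static inline uint16_t model_domain(uint32_t db)
{
	return (uint16_t)(db >> 16);
}

/* Extract the bus number part of a combined value. */
static inline uint8_t model_busnum(uint32_t db)
{
	return (uint8_t)(db & 0xff);
}
```

With such a layout a caller that still passes a bare bus number is
equivalent to passing domain 0, which would let untouched callers keep
their arguments.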

[PATCH 21/28] PCI/mvebu: Use pci_common_init_dev() to simplify code

2015-01-15 Thread Yijing Wang
mvebu_pcie_scan_bus() is not necessary: we can call
pci_common_init_dev() instead of pci_common_init() and
pass the device pointer as the parent. pci_scan_root_bus()
will then be called to scan the PCI buses.

Signed-off-by: Yijing Wang 
---
 drivers/pci/host/pci-mvebu.c |   18 +-
 1 files changed, 1 insertions(+), 17 deletions(-)

diff --git a/drivers/pci/host/pci-mvebu.c b/drivers/pci/host/pci-mvebu.c
index 1309cfb..d5a2b70 100644
--- a/drivers/pci/host/pci-mvebu.c
+++ b/drivers/pci/host/pci-mvebu.c
@@ -750,21 +750,6 @@ static int mvebu_pcie_setup(int nr, struct pci_sys_data *sys)
return 1;
 }
 
-static struct pci_bus *mvebu_pcie_scan_bus(int nr, struct pci_sys_data *sys)
-{
-   struct mvebu_pcie *pcie = sys_to_pcie(sys);
-   struct pci_bus *bus;
-
-   bus = pci_create_root_bus(>pdev->dev, sys->busnr,
- _pcie_ops, sys, >resources);
-   if (!bus)
-   return NULL;
-
-   pci_scan_child_bus(bus);
-
-   return bus;
-}
-
 static resource_size_t mvebu_pcie_align_resource(struct pci_dev *dev,
 const struct resource *res,
 resource_size_t start,
@@ -808,12 +793,11 @@ static void mvebu_pcie_enable(struct mvebu_pcie *pcie)
hw.nr_controllers = 1;
hw.private_data   = (void **)
hw.setup  = mvebu_pcie_setup;
-   hw.scan   = mvebu_pcie_scan_bus;
hw.map_irq= of_irq_parse_and_map_pci;
hw.ops= _pcie_ops;
hw.align_resource = mvebu_pcie_align_resource;
 
-   pci_common_init();
+   pci_common_init_dev(>pdev->dev, );
 }
 
 /*
-- 
1.7.1



[PATCH 22/28] PCI/tegra: Remove redundant tegra_pcie_scan_bus()

2015-01-15 Thread Yijing Wang
pci_scan_root_bus() is now almost the same as
pci_create_root_bus() + pci_scan_child_bus(), so the
common pci_scan_root_bus() call in pci_common_init_dev()
can be used to scan the PCI buses. tegra_pcie_scan_bus()
is redundant; remove it.

Signed-off-by: Yijing Wang 
---
 drivers/pci/host/pci-tegra.c |   15 ---
 1 files changed, 0 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
index 6f9c29f..d9d1af0 100644
--- a/drivers/pci/host/pci-tegra.c
+++ b/drivers/pci/host/pci-tegra.c
@@ -679,21 +679,6 @@ static int tegra_pcie_map_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
return irq;
 }
 
-static struct pci_bus *tegra_pcie_scan_bus(int nr, struct pci_sys_data *sys)
-{
-   struct tegra_pcie *pcie = sys_to_pcie(sys);
-   struct pci_bus *bus;
-
-   bus = pci_create_root_bus(pcie->dev, sys->busnr, _pcie_ops, sys,
- >resources);
-   if (!bus)
-   return NULL;
-
-   pci_scan_child_bus(bus);
-
-   return bus;
-}
-
 static irqreturn_t tegra_pcie_isr(int irq, void *arg)
 {
const char *err_msg[] = {
-- 
1.7.1



[PATCH 20/28] PCI/Parisc: Use pci_scan_root_bus() for simplicity

2015-01-15 Thread Yijing Wang
From: Yijing Wang 

Now that pci_bus_add_devices() has been ripped out of
pci_scan_root_bus(), we can use pci_scan_root_bus()
instead of pci_create_root_bus() + pci_scan_child_bus()
for simplicity.

Signed-off-by: Yijing Wang 
---
 drivers/parisc/dino.c|   11 ++-
 drivers/parisc/lba_pci.c |6 ++
 2 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/drivers/parisc/dino.c b/drivers/parisc/dino.c
index a0580af..e5ee339 100644
--- a/drivers/parisc/dino.c
+++ b/drivers/parisc/dino.c
@@ -977,15 +977,11 @@ static int __init dino_probe(struct parisc_device *dev)
if (dino_dev->hba.gmmio_space.flags)
pci_add_resource(, _dev->hba.gmmio_space);
 
-   dino_dev->hba.bus_num.start = dino_current_bus;
-   dino_dev->hba.bus_num.end = 255;
-   dino_dev->hba.bus_num.flags = IORESOURCE_BUS;
-   pci_add_resource(, _dev->hba.bus_num);
/*
** It's not used to avoid chicken/egg problems
** with configuration accessor functions.
*/
-   dino_dev->hba.hba_bus = bus = pci_create_root_bus(>dev,
+   dino_dev->hba.hba_bus = bus = pci_scan_root_bus(>dev,
 dino_current_bus, _cfg_ops, NULL, );
if (!bus) {
        printk(KERN_ERR "ERROR: failed to scan PCI bus on %s (duplicate bus number %d?)\n",
@@ -996,13 +992,10 @@ static int __init dino_probe(struct parisc_device *dev)
return 0;
}
 
-   max = pci_scan_child_bus(bus);
-   pci_bus_update_busn_res_end(bus, max);
-
/* This code *depends* on scanning being single threaded
 * if it isn't, this global bus number count will fail
 */
-   dino_current_bus = max + 1;
+   dino_current_bus = bus->busn_res.end + 1;
pci_bus_assign_resources(bus);
pci_bus_add_devices(bus);
return 0;
diff --git a/drivers/parisc/lba_pci.c b/drivers/parisc/lba_pci.c
index 37e71ff..9e3a016 100644
--- a/drivers/parisc/lba_pci.c
+++ b/drivers/parisc/lba_pci.c
@@ -1564,15 +1564,13 @@ lba_driver_probe(struct parisc_device *dev)
 
dev->dev.platform_data = lba_dev;
lba_bus = lba_dev->hba.hba_bus =
-   pci_create_root_bus(>dev, lba_dev->hba.bus_num.start,
+   pci_scan_root_bus(>dev, lba_dev->hba.bus_num.start,
cfg_ops, NULL, );
if (!lba_bus) {
pci_free_resource_list();
return 0;
}
 
-   max = pci_scan_child_bus(lba_bus);
-
/* This is in lieu of calling pci_assign_unassigned_resources() */
if (is_pdc_pat()) {
/* assign resources to un-initialized devices */
@@ -1600,7 +1598,7 @@ lba_driver_probe(struct parisc_device *dev)
lba_dev->flags |= LBA_FLAG_SKIP_PROBE;
}
 
-   lba_next_bus = max + 1;
+   lba_next_bus = pci_bus_child_max_busnr(lba_bus) + 1;
pci_bus_add_devices(lba_bus);
 
/* Whew! Finally done! Tell services we got this one covered. */
-- 
1.7.1
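
The dino hunk derives the next root bus number from the end of the bus
range that the scan just claimed. A minimal user-space sketch of that
bookkeeping, assuming a single-threaded probe as the driver comment
requires; struct model_busn, model_current_bus and model_probe_root()
are hypothetical names, and the number of child buses is passed in
instead of being discovered by a scan.

```c
#include <assert.h>

/* Hypothetical stand-in for the bus number range of a root bus. */
struct model_busn {
	int start;
	int end;
};

/* Global counter, like dino_current_bus; relies on serial probing. */
static int model_current_bus;

/*
 * Claim a root bus starting at the global counter. nr_child_buses is
 * how many subordinate buses the (mock) scan found below the root.
 */
static struct model_busn model_probe_root(int nr_child_buses)
{
	struct model_busn res;

	res.start = model_current_bus;
	res.end = res.start + nr_child_buses;
	/* Same shape as: dino_current_bus = bus->busn_res.end + 1; */
	model_current_bus = res.end + 1;
	return res;
}
```

Two probes in a row receive adjacent, non-overlapping ranges as long as
nothing else touches the counter in between.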



[PATCH 16/28] PCI/powerpc: Use pci_scan_root_bridge() for simplicity

2015-01-15 Thread Yijing Wang
Use pci_scan_root_bridge() to scan the PCI buses and
provide powerpc-specific pci_host_bridge_ops.

Signed-off-by: Yijing Wang 
---
 arch/powerpc/kernel/pci-common.c |   57 --
 1 files changed, 36 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 2cf941e..0c7fb81 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -767,7 +767,8 @@ int pci_proc_domain(struct pci_bus *bus)
return 1;
 }
 
-int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
+static int pci_host_bridge_set_root_bus_speed(
+   struct pci_host_bridge *bridge)
 {
if (ppc_md.pcibios_set_root_bus_speed)
return ppc_md.pcibios_set_root_bus_speed(bridge);
@@ -775,6 +776,27 @@ int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
return 0;
 }
 
+static void pci_host_bridge_probe_mode(struct pci_host_bridge *host)
+{
+   int mode = PCI_PROBE_NORMAL;
+   struct pci_bus *bus = host->bus;
+   struct pci_controller *hose = dev_get_drvdata(>dev);
+
+   /* Get probe mode and perform scan */
+   if (hose->dn && ppc_md.pci_probe_mode)
+   mode = ppc_md.pci_probe_mode(bus);
+
+   pr_debug("probe mode: %d\n", mode);
+   if (mode == PCI_PROBE_DEVTREE)
+   host->of_scan = true;
+}
+
+static void pci_host_bridge_of_scan_bus(struct pci_host_bridge *host)
+{
+   struct pci_controller *hose = dev_get_drvdata(>dev);
+
+   of_scan_bus(host->dn, bus);
+}
 /* This header fixup will do the resource fixup for all devices as they are
  * probed, but not for bridge ranges
  */
@@ -1577,6 +1599,12 @@ struct device_node *pcibios_get_phb_of_node(struct pci_bus *bus)
return of_node_get(hose->dn);
 }
 
+static struct pci_host_bridge_ops *phb_ops = {
+   .phb_set_root_bus_speed = pci_host_bridge_set_root_bus_speed,
+   .phb_probe_mode = pci_host_bridge_probe_mode,
+   .phb_of_scan_bus = pci_host_bridge_of_scan_bus,
+};
+
 /**
  * pci_scan_phb - Given a pci_controller, setup and scan the PCI bus
  * @hose: Pointer to the PCI host controller instance structure
@@ -1584,9 +1612,9 @@ struct device_node *pcibios_get_phb_of_node(struct pci_bus *bus)
 void pcibios_scan_phb(struct pci_controller *hose)
 {
LIST_HEAD(resources);
+   struct pci_host_bridge *host;
struct pci_bus *bus;
struct device_node *node = hose->dn;
-   int mode;
 
pr_debug("PCI: Scanning PHB %s\n", of_node_full_name(node));
 
@@ -1602,30 +1630,17 @@ void pcibios_scan_phb(struct pci_controller *hose)
pci_add_resource(, >busn);
 
/* Create an empty bus for the toplevel */
-   bus = pci_create_root_bus(hose->parent, 
+   host = pci_scan_root_bridge(hose->parent,
PCI_DOMBUS(hose->global_number, hose->first_busno),
-   hose->ops, hose, );
-   if (bus == NULL) {
-   pr_err("Failed to create bus for PCI domain %04x\n",
+   hose->ops, hose, , _ops);
+   if (host == NULL) {
+   pr_err("Failed to create host bridge for PCI domain %04x\n",
hose->global_number);
pci_free_resource_list();
return;
}
-   hose->bus = bus;
-
-   /* Get probe mode and perform scan */
-   mode = PCI_PROBE_NORMAL;
-   if (node && ppc_md.pci_probe_mode)
-   mode = ppc_md.pci_probe_mode(bus);
-   pr_debug("probe mode: %d\n", mode);
-   if (mode == PCI_PROBE_DEVTREE)
-   of_scan_bus(node, bus);
-
-   if (mode == PCI_PROBE_NORMAL) {
-   pci_bus_update_busn_res_end(bus, 255);
-   hose->last_busno = pci_scan_child_bus(bus);
-   pci_bus_update_busn_res_end(bus, hose->last_busno);
-   }
+   hose->bus = host->bus;
+   hose->last_busno = host->bus->busn_res.end;
 
/* Platform gets a chance to do some global fixups before
 * we proceed to resource allocation
-- 
1.7.1
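
The callback table used above follows the usual "ops struct with
optional hooks" pattern. The sketch below is a user-space model under
stated assumptions: pci_scan_root_bridge() itself is not part of this
mail, so the order in which model_scan_root_bridge() invokes the hooks,
and the fallback to a generic scan when of_scan is not set, are guesses
for illustration. All model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_host;

/* Optional per-platform hooks; any of them may be NULL. */
struct model_host_ops {
	int  (*set_root_bus_speed)(struct model_host *host);
	void (*probe_mode)(struct model_host *host);
	void (*of_scan_bus)(struct model_host *host);
};

struct model_host {
	const struct model_host_ops *ops;
	bool of_scan;		/* set by probe_mode when DT scan is wanted */
	int scanned_by;		/* 0: not scanned, 1: generic, 2: device tree */
};

/* Stand-in for a normal config-space scan. */
static void model_generic_scan(struct model_host *host)
{
	host->scanned_by = 1;
}

/* ASSUMED hook order: speed, then probe mode, then the chosen scan. */
static int model_scan_root_bridge(struct model_host *host)
{
	const struct model_host_ops *ops = host->ops;
	int err;

	if (ops && ops->set_root_bus_speed) {
		err = ops->set_root_bus_speed(host);
		if (err)
			return err;
	}
	if (ops && ops->probe_mode)
		ops->probe_mode(host);

	if (host->of_scan && ops && ops->of_scan_bus)
		ops->of_scan_bus(host);
	else
		model_generic_scan(host);
	return 0;
}

/* Example platform hooks that always select the device tree scan. */
static void model_pick_devtree(struct model_host *host)
{
	host->of_scan = true;
}

static void model_devtree_scan(struct model_host *host)
{
	host->scanned_by = 2;
}

static const struct model_host_ops model_devtree_ops = {
	.probe_mode  = model_pick_devtree,
	.of_scan_bus = model_devtree_scan,
};
```

A host without ops falls through to the generic scan, while a host
using model_devtree_ops is scanned by its own hook.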



[PATCH 09/28] PCI: Separate pci_host_bridge creation out of pci_create_root_bus()

2015-01-15 Thread Yijing Wang
We want a generic pci_host_bridge in which common PCI
information such as the domain number can be kept. Moving
pci_host_bridge creation out of pci_create_root_bus()
makes the code more readable. Furthermore, the generic
pci_host_bridge can then hold host-bridge-specific
operations like pcibios_root_bridge_prepare().

Signed-off-by: Yijing Wang 
---
 drivers/pci/host-bridge.c |   78 +
 drivers/pci/probe.c   |  121 +++--
 include/linux/pci.h   |6 ++-
 3 files changed, 134 insertions(+), 71 deletions(-)

diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
index 0e5f3c9..c9ee582 100644
--- a/drivers/pci/host-bridge.c
+++ b/drivers/pci/host-bridge.c
@@ -8,6 +8,84 @@
 
 #include "pci.h"
 
+static LIST_HEAD(pci_host_bridge_list);
+static DEFINE_MUTEX(phb_mutex);
+
+static void pci_release_host_bridge_dev(struct device *dev)
+{
+   struct pci_host_bridge *bridge = to_pci_host_bridge(dev);
+
+   if (bridge->release_fn)
+   bridge->release_fn(bridge);
+
+   pci_free_resource_list(>windows);
+   kfree(bridge);
+}
+
+struct pci_host_bridge *pci_create_host_bridge(
+   struct device *parent, u32 db, struct list_head *resources)
+{
+   int error;
+   int bus = PCI_BUSNUM(db);
+   int domain = PCI_DOMAIN(db);
+   struct pci_host_bridge *host, *temp;
+   struct pci_host_bridge_window *window, *n;
+
+   host = kzalloc(sizeof(*host), GFP_KERNEL);
+   if (!host)
+   return NULL;
+
+   host->busnum = bus;
+   host->domain = domain;
+   /* If support CONFIG_PCI_DOMAINS_GENERIC, use
+* pci_host_assign_domain_nr() to assign domain
+* number instead argu u32 db.
+*/
+   pci_host_assign_domain_nr(host);
+
+   mutex_lock(_mutex);
+   list_for_each_entry(temp, _host_bridge_list, list)
+   if (temp->domain == host->domain
+   && host->busnum == temp->busnum) {
+   dev_dbg(>dev, "pci host bridge pci%04x:%02x exist\n",
+   host->domain, host->busnum);
+   mutex_unlock(_mutex);
+   kfree(host);
+   return NULL;
+   }
+   mutex_unlock(_mutex);
+
+   host->dev.parent = parent;
+   INIT_LIST_HEAD(>windows);
+   host->dev.release = pci_release_host_bridge_dev;
+   dev_set_name(>dev, "pci%04x:%02x", host->domain,
+   host->busnum);
+
+   error = device_register(>dev);
+   if (error) {
+   put_device(>dev);
+   return NULL;
+   }
+
+   list_for_each_entry_safe(window, n, resources, list)
+   list_move_tail(>list, >windows);
+
+   mutex_lock(_mutex);
+   list_add_tail(>list, _host_bridge_list);
+   mutex_unlock(_mutex);
+   return host;
+}
+EXPORT_SYMBOL(pci_create_host_bridge);
+
+void pci_free_host_bridge(struct pci_host_bridge *host)
+{
+   mutex_lock(_mutex);
+   list_del(>list);
+   mutex_unlock(_mutex);
+
+   device_unregister(>dev);
+}
+
 static struct pci_bus *find_pci_root_bus(struct pci_bus *bus)
 {
while (bus->parent)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 50f58b3..2e0b952 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -502,31 +502,6 @@ static struct pci_bus *pci_alloc_bus(struct pci_bus *parent)
return b;
 }
 
-static void pci_release_host_bridge_dev(struct device *dev)
-{
-   struct pci_host_bridge *bridge = to_pci_host_bridge(dev);
-
-   if (bridge->release_fn)
-   bridge->release_fn(bridge);
-
-   pci_free_resource_list(>windows);
-
-   kfree(bridge);
-}
-
-static struct pci_host_bridge *pci_alloc_host_bridge(struct pci_bus *b)
-{
-   struct pci_host_bridge *bridge;
-
-   bridge = kzalloc(sizeof(*bridge), GFP_KERNEL);
-   if (!bridge)
-   return NULL;
-
-   INIT_LIST_HEAD(>windows);
-   bridge->bus = b;
-   return bridge;
-}
-
 static const unsigned char pcix_bus_speed[] = {
PCI_SPEED_UNKNOWN,  /* 0 */
PCI_SPEED_66MHz_PCIX,   /* 1 */
@@ -1889,54 +1864,35 @@ void __weak pcibios_remove_bus(struct pci_bus *bus)
 {
 }
 
-struct pci_bus *pci_create_root_bus(struct device *parent, u32 db,
-   struct pci_ops *ops, void *sysdata, struct list_head *resources)
+static struct pci_bus *__pci_create_root_bus(
+   struct pci_host_bridge *bridge, struct pci_ops *ops,
+   void *sysdata)
 {
int error;
-   struct pci_host_bridge *bridge;
-   struct pci_bus *b, *b2;
-   struct pci_host_bridge_window *window, *n;
+   struct pci_bus *b;
+   struct pci_host_bridge_window *window;
+   struct device *parent;
struct resource *res;
resource_size_t offset;
char bus_addr[64];
char 
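
pci_create_host_bridge() in this patch rejects a new bridge when one
with the same (domain, busnum) pair is already on the global list. A
minimal user-space sketch of that duplicate check, with assumptions: a
fixed-size array replaces the linked list, there is no locking (the
patch uses a mutex around the list), and all model_* names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_BRIDGES 8

/* Hypothetical stand-in for the identifying part of a host bridge. */
struct model_bridge {
	int domain;
	int busnum;
};

static struct model_bridge model_bridges[MODEL_MAX_BRIDGES];
static size_t model_nr_bridges;

/*
 * Register a bridge. Returns the new entry, or NULL if the
 * (domain, busnum) pair already exists or the table is full.
 */
static struct model_bridge *model_create_host_bridge(int domain, int busnum)
{
	size_t i;

	/* Same comparison as the list walk in the patch. */
	for (i = 0; i < model_nr_bridges; i++)
		if (model_bridges[i].domain == domain &&
		    model_bridges[i].busnum == busnum)
			return NULL;

	if (model_nr_bridges == MODEL_MAX_BRIDGES)
		return NULL;

	model_bridges[model_nr_bridges].domain = domain;
	model_bridges[model_nr_bridges].busnum = busnum;
	return &model_bridges[model_nr_bridges++];
}
```

The same bus number may appear in different domains; only an identical
pair is refused.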

Re: [PATCH v2 2/2] mm: don't use compound_head() in virt_to_head_page()

2015-01-15 Thread Christoph Lameter
On Thu, 15 Jan 2015, Andrew Morton wrote:

> And perhaps some explanation here as to why virt_to_head_page() can
> safely use compound_head_fast().  There's an assumption here that
> nobody will be dismantling the compound page while virt_to_head_page()
> is in progress, yes?  And this assumption also holds for the calling
> code, because otherwise the virt_to_head_page() return value is kinda
> meaningless.

I think this assumption is pretty natural to make. A compound_head that
works well while dismantling a compound page should be marked specially
and Joonsoo's definition should be the standard.




[PATCH] ARM: clk-imx6q: refine esai_ipg's parent

2015-01-15 Thread Shengjiu Wang
esai_ipg clock's parent is ahb, not ipg.

Signed-off-by: Shengjiu Wang 
---
 arch/arm/mach-imx/clk-imx6q.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-imx/clk-imx6q.c b/arch/arm/mach-imx/clk-imx6q.c
index 2daef61..d04a430 100644
--- a/arch/arm/mach-imx/clk-imx6q.c
+++ b/arch/arm/mach-imx/clk-imx6q.c
@@ -386,7 +386,7 @@ static void __init imx6q_clocks_init(struct device_node *ccm_node)
    clk[IMX6Q_CLK_ECSPI5] = imx_clk_gate2("ecspi5", "ecspi_root", base + 0x6c, 8);
    clk[IMX6QDL_CLK_ENET] = imx_clk_gate2("enet",  "ipg",   base + 0x6c, 10);
    clk[IMX6QDL_CLK_ESAI_EXTAL]   = imx_clk_gate2_shared("esai_extal",   "esai_podf",   base + 0x6c, 16, _count_esai);
-   clk[IMX6QDL_CLK_ESAI_IPG] = imx_clk_gate2_shared("esai_ipg",   "ipg",   base + 0x6c, 16, _count_esai);
+   clk[IMX6QDL_CLK_ESAI_IPG] = imx_clk_gate2_shared("esai_ipg",   "ahb",   base + 0x6c, 16, _count_esai);
    clk[IMX6QDL_CLK_ESAI_MEM] = imx_clk_gate2_shared("esai_mem", "ahb", base + 0x6c, 16, _count_esai);
    clk[IMX6QDL_CLK_GPT_IPG]  = imx_clk_gate2("gpt_ipg",   "ipg",   base + 0x6c, 20);
    clk[IMX6QDL_CLK_GPT_IPG_PER]  = imx_clk_gate2("gpt_ipg_per",   "ipg_per",   base + 0x6c, 22);
-- 
1.7.9.5


