Re: [PATCH net-next] rcu: prevent RCU_LOCKDEP_WARN() from swallowing the condition

2020-09-14 Thread Joel Fernandes
On Mon, Sep 14, 2020 at 03:47:38PM -0700, Jakub Kicinski wrote:
> On Mon, 14 Sep 2020 16:21:22 -0400 Joel Fernandes wrote:
> > On Tue, Sep 08, 2020 at 05:27:51PM -0700, Jakub Kicinski wrote:
> > > On Tue, 08 Sep 2020 21:15:56 +0300 niko...@cumulusnetworks.com wrote:  
> > > > Ah, you want to solve it for all. :) 
> > > > Looks and sounds good to me, 
> > > > Reviewed-by: Nikolay Aleksandrov   
> > > 
> > > Actually, I give up, lockdep_is_held() is not defined without
> > > CONFIG_LOCKDEP, let's just go with your patch..  
> > 
> > Care to send a patch just for the RCU macro then? Not sure what Dave is
> > applying but if the net-next tree is not taking the RCU macro change, then
> > send another one with my tag:
> 
> Seems like quite a few places depend on the macro disappearing its
> argument. I was concerned that it's going to be had to pick out whether
> !LOCKDEP builds should return true or false from LOCKDEP helpers, but
> perhaps relying on the linker errors even more is not such poor taste?
> 
> Does the patch below look acceptable to you?
> 
> --->8
> 
> rcu: prevent RCU_LOCKDEP_WARN() from swallowing the condition
> 
> We run into a unused variable warning in bridge code when
> variable is only used inside the condition of
> rcu_dereference_protected().
> 
>  #define mlock_dereference(X, br) \
>   rcu_dereference_protected(X, lockdep_is_held(&br->multicast_lock))
> 
> Since on builds with CONFIG_PROVE_RCU=n rcu_dereference_protected()
> compiles to nothing the compiler doesn't see the variable use.
> 
> Prevent the warning by adding the condition as dead code.
> We need to un-hide the declaration of lockdep_tasklist_lock_is_held(),
> lockdep_sock_is_held(), RCU lock maps and remove some declarations
> in net/sched header, because they have a wrong type.
> 
> Add forward declarations of lockdep_is_held(), lock_is_held() which
> will cause a linker errors if actually used with !LOCKDEP.
> At least RCU expects some locks _not_ to be held so it's hard to
> pick true/false for a dummy implementation.
> 
> Signed-off-by: Jakub Kicinski 
> ---
>  include/linux/lockdep.h|  6 ++
>  include/linux/rcupdate.h   | 11 ++-
>  include/linux/rcupdate_trace.h |  4 ++--
>  include/linux/sched/task.h |  2 --
>  include/net/sch_generic.h  | 12 
>  include/net/sock.h |  2 --

Would it make sense to split it into individual patches?

So 1 for rcu, 1 for lockdep and then 1 for networking. The lockdep ones may
need PeterZ's ack.

thanks,

 - Joel


>  6 files changed, 14 insertions(+), 23 deletions(-)
> 
> diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
> index 6a584b3e5c74..c4b6225ee320 100644
> --- a/include/linux/lockdep.h
> +++ b/include/linux/lockdep.h
> @@ -371,6 +371,12 @@ static inline void lockdep_unregister_key(struct 
> lock_class_key *key)
>  
>  #define lockdep_depth(tsk)   (0)
>  
> +/*
> + * Dummy forward declarations, allow users to write less ifdef-y code
> + * and depend on dead code elimination.
> + */
> +int lock_is_held(const void *);
> +int lockdep_is_held(const void *);
>  #define lockdep_is_held_type(l, r)   (1)
>  
>  #define lockdep_assert_held(l)   do { (void)(l); } while 
> (0)
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index d15d46db61f7..50d45781fa99 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -234,6 +234,11 @@ bool rcu_lockdep_current_cpu_online(void);
>  static inline bool rcu_lockdep_current_cpu_online(void) { return true; }
>  #endif /* #else #if defined(CONFIG_HOTPLUG_CPU) && defined(CONFIG_PROVE_RCU) 
> */
>  
> +extern struct lockdep_map rcu_lock_map;
> +extern struct lockdep_map rcu_bh_lock_map;
> +extern struct lockdep_map rcu_sched_lock_map;
> +extern struct lockdep_map rcu_callback_map;
> +
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
>  
>  static inline void rcu_lock_acquire(struct lockdep_map *map)
> @@ -246,10 +251,6 @@ static inline void rcu_lock_release(struct lockdep_map 
> *map)
>   lock_release(map, _THIS_IP_);
>  }
>  
> -extern struct lockdep_map rcu_lock_map;
> -extern struct lockdep_map rcu_bh_lock_map;
> -extern struct lockdep_map rcu_sched_lock_map;
> -extern struct lockdep_map rcu_callback_map;
>  int debug_lockdep_rcu_enabled(void);
>  int rcu_read_lock_held(void);
>  int rcu_read_lock_bh_held(void);
> @@ -320,7 +321,7 @@ static inline void rcu_preempt_sleep_check(void) { }
>  
>  #else /* #ifdef CONFIG_PROVE_RCU */
>  
> -#define RCU_LOCKDEP_WARN(c, s

Re: [PATCH net-next] rcu: prevent RCU_LOCKDEP_WARN() from swallowing the condition

2020-09-14 Thread Joel Fernandes
On Tue, Sep 08, 2020 at 05:27:51PM -0700, Jakub Kicinski wrote:
> On Tue, 08 Sep 2020 21:15:56 +0300 niko...@cumulusnetworks.com wrote:
> > Ah, you want to solve it for all. :) 
> > Looks and sounds good to me, 
> > Reviewed-by: Nikolay Aleksandrov 
> 
> Actually, I give up, lockdep_is_held() is not defined without
> CONFIG_LOCKDEP, let's just go with your patch..

Care to send a patch just for the RCU macro then? Not sure what Dave is
applying but if the net-next tree is not taking the RCU macro change, then
send another one with my tag:

Reviewed-by: Joel Fernandes (Google) 

thanks!

 - Joel



Re: [ovs-discuss] Double free in recent kernels after memleak fix

2020-08-10 Thread Joel Fernandes
On Fri, Aug 07, 2020 at 03:20:15PM -0700, Paul E. McKenney wrote:
> On Fri, Aug 07, 2020 at 04:47:56PM -0400, Joel Fernandes wrote:
> > Hi,
> > Adding more of us working on RCU as well. Johan from another team at
> > Google discovered a likely issue in openswitch, details below:
> > 
> > On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös  wrote:
> > >
> > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose  wrote:
> > > >
> > > >
> > > >
> > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote:
> > > > > Hi Open vSwitch contributors,
> > > > >
> > > > > We have found openvswitch is causing double-freeing of memory. The
> > > > > issue was not present in kernel version 5.5.17 but is present in
> > > > > 5.6.14 and newer kernels.
> > > > >
> > > > > After reverting the RCU commits below for debugging, enabling
> > > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this
> > > > > email in the kernel log (the last one shows the double-free). When I
> > > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch:
> > > > > fix possible memleak on destroy flow-table"), the symptoms disappear.
> > > > > While I have a reliable way to reproduce the issue, I unfortunately
> > > > > don't yet have a process that's amenable to sharing. Please take a
> > > > > look.
> > > > >
> > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch()
> > > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback 
> > > > > handling
> > > > > e99637becb2e rcu: Add support for debug_objects debugging for 
> > > > > kfree_rcu()
> > > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work
> > > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
> > > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching
> > 
> > Note that these reverts were only for testing the same code, because
> > he was testing 2 different kernel versions. One of them did not have
> > this set. So I asked him to revert. There's no known bug in the
> > reverted code itself. But somehow these patches do make it harder for
> > him to reproduce the issue.
> 
> Perhaps they adjust timing?

Yes that could be it. In my testing (which is unrelated to OVS), the issue
happens only with TREE02. I can reproduce the issue in [1] on just boot-up of
TREE02.

I could have screwed up something in my segcblist count patch, any hints
would be great. I'll dig more into it as well.

> > 
> > But then again, I have not heard reports of this warning firing. Paul,
> > has this come to your radar recently?
> 
> I have not seen any recent WARNs in rcu_do_batch().  I am guessing that
> this is one of the last two in that function?
> 
> If so, have you tried using CONFIG_DEBUG_OBJECTS_RCU_HEAD=y?  That Kconfig
> option is designed to help locate double frees via RCU.

Yes true, kfree_rcu() also has support for this. Jonathan, did you get a
chance to try this out in your failure scenario?

thanks,

 - Joel

[1] https://lore.kernel.org/lkml/20200720005334.GC19262@shao2-debian/


Re: [ovs-discuss] Double free in recent kernels after memleak fix

2020-08-07 Thread Joel Fernandes
On Fri, Aug 7, 2020 at 4:47 PM Joel Fernandes  wrote:
>
> Hi,
> Adding more of us working on RCU as well. Johan from another team at
> Google discovered a likely issue in openswitch, details below:
>
> On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös  wrote:
> >
> > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose  wrote:
> > >
> > >
> > >
> > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote:
> > > > Hi Open vSwitch contributors,
> > > >
> > > > We have found openvswitch is causing double-freeing of memory. The
> > > > issue was not present in kernel version 5.5.17 but is present in
> > > > 5.6.14 and newer kernels.
> > > >
> > > > After reverting the RCU commits below for debugging, enabling
> > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this
> > > > email in the kernel log (the last one shows the double-free). When I
> > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch:
> > > > fix possible memleak on destroy flow-table"), the symptoms disappear.
> > > > While I have a reliable way to reproduce the issue, I unfortunately
> > > > don't yet have a process that's amenable to sharing. Please take a
> > > > look.
> > > >
> > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch()
> > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback 
> > > > handling
> > > > e99637becb2e rcu: Add support for debug_objects debugging for 
> > > > kfree_rcu()
> > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work
> > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
> > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching
> > > >
>
> Note that these reverts were only for testing the same code, because
> he was testing 2 different kernel versions. One of them did not have
> this set. So I asked him to revert. There's no known bug in the
> reverted code itself. But somehow these patches do make it harder for
> him to reproduce the issue.

And the reason for this is likely the additional kfree batching is
slowing down the occurrence of the crash.

thanks,

 - Joel


Re: [ovs-discuss] Double free in recent kernels after memleak fix

2020-08-07 Thread Joel Fernandes
Hi,
Adding more of us working on RCU as well. Johan from another team at
Google discovered a likely issue in openswitch, details below:

On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös  wrote:
>
> On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose  wrote:
> >
> >
> >
> > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote:
> > > Hi Open vSwitch contributors,
> > >
> > > We have found openvswitch is causing double-freeing of memory. The
> > > issue was not present in kernel version 5.5.17 but is present in
> > > 5.6.14 and newer kernels.
> > >
> > > After reverting the RCU commits below for debugging, enabling
> > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this
> > > email in the kernel log (the last one shows the double-free). When I
> > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch:
> > > fix possible memleak on destroy flow-table"), the symptoms disappear.
> > > While I have a reliable way to reproduce the issue, I unfortunately
> > > don't yet have a process that's amenable to sharing. Please take a
> > > look.
> > >
> > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch()
> > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback 
> > > handling
> > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu()
> > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work
> > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
> > > a35d16905efc rcu: Add basic support for kfree_rcu() batching
> > >

Note that these reverts were only for testing the same code, because
he was testing 2 different kernel versions. One of them did not have
this set. So I asked him to revert. There's no known bug in the
reverted code itself. But somehow these patches do make it harder for
him to reproduce the issue.

> > > Thanks,
> > > Johan Knöös
> >
> > Let's add the author of the patch you reverted and the Linux netdev
> > mailing list.
> >
> > - Greg
>
> I found we also sometimes get warnings from
> https://elixir.bootlin.com/linux/v5.5.17/source/kernel/rcu/tree.c#L2239
> under similar conditions even on kernel 5.5.17, which I believe may be
> related. However, it's much rarer and I don't have a reliable way of
> reproducing it. Perhaps 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 only
> increases the frequency of a pre-existing bug.

This is interesting, because I saw kbuild warn me recently [1] about
it as well. Though, I was actually intentionally messing with the
segcblist. I plan to debug it next week, but the warning itself is
unlikely to be caused by my patch IMHO (since it is slightly
orthogonal to what I changed).

[1] https://lore.kernel.org/lkml/20200720005334.GC19262@shao2-debian/

But then again, I have not heard reports of this warning firing. Paul,
has this come to your radar recently?

Thanks,

 - Joel


Re: [PATCH linux-rcu] docs/litmus-tests: add BPF ringbuf MPSC litmus tests

2020-05-29 Thread Joel Fernandes
Hi Andrii,

On Thu, May 28, 2020 at 10:50:30PM -0700, Andrii Nakryiko wrote:
> > [...]
> > > diff --git a/Documentation/litmus-tests/bpf-rb/bpf-rb+1p1c+bounded.litmus 
> > > b/Documentation/litmus-tests/bpf-rb/bpf-rb+1p1c+bounded.litmus
> > > new file mode 100644
> > > index ..558f054fb0b4
> > > --- /dev/null
> > > +++ b/Documentation/litmus-tests/bpf-rb/bpf-rb+1p1c+bounded.litmus
> > > @@ -0,0 +1,91 @@
> > > +C bpf-rb+1p1c+bounded
> > > +
> > > +(*
> > > + * Result: Always
> > > + *
> > > + * This litmus test validates BPF ring buffer implementation under the
> > > + * following assumptions:
> > > + * - 1 producer;
> > > + * - 1 consumer;
> > > + * - ring buffer has capacity for only 1 record.
> > > + *
> > > + * Expectations:
> > > + * - 1 record pushed into ring buffer;
> > > + * - 0 or 1 element is consumed.
> > > + * - no failures.
> > > + *)
> > > +
> > > +{
> > > + atomic_t dropped;
> > > +}
> > > +
> > > +P0(int *lenFail, int *len1, int *cx, int *px)
> > > +{
> > > + int *rLenPtr;
> > > + int rLen;
> > > + int rPx;
> > > + int rCx;
> > > + int rFail;
> > > +
> > > + rFail = 0;
> > > +
> > > + rCx = smp_load_acquire(cx);
> > > + rPx = smp_load_acquire(px);
> >
> > Is it possible for you to put some more comments around which ACQUIRE is
> > paired with which RELEASE? And, in general more comments around the reason
> > for a certain memory barrier and what pairs with what. In the kernel 
> > sources,
> > the barriers needs a comment anyway.

This was the comment earlier that was missed.

> > > + if (rCx < rPx) {
> > > + if (rCx == 0) {
> > > + rLenPtr = len1;
> > > + } else {
> > > + rLenPtr = lenFail;
> > > + rFail = 1;
> > > + }
> > > +
> > > + rLen = smp_load_acquire(rLenPtr);
> > > + if (rLen == 0) {
> > > + rFail = 1;
> > > + } else if (rLen == 1) {
> > > + rCx = rCx + 1;
> > > + smp_store_release(cx, rCx);
> > > + }
> > > + }
> > > +}
> > > +
> > > +P1(int *lenFail, int *len1, spinlock_t *rb_lock, int *px, int *cx, 
> > > atomic_t *dropped)
> > > +{
> > > + int rPx;
> > > + int rCx;
> > > + int rFail;
> > > + int *rLenPtr;
> > > +
> > > + rFail = 0;
> > > +
> > > + rCx = smp_load_acquire(cx);
> > > + spin_lock(rb_lock);
> > > +
> > > + rPx = *px;
> > > + if (rPx - rCx >= 1) {
> > > + atomic_inc(dropped);
> >
> > Why does 'dropped' need to be atomic if you are always incrementing under a
> > lock?
> 
> It doesn't, strictly speaking, but making it atomic in litmus test was
> just more convenient, especially that I initially also had a lock-less
> variant of this algorithm.

Ok, that's fine.

> >
> > > + spin_unlock(rb_lock);
> > > + } else {
> > > + if (rPx == 0) {
> > > + rLenPtr = len1;
> > > + } else {
> > > + rLenPtr = lenFail;
> > > + rFail = 1;
> > > + }
> > > +
> > > + *rLenPtr = -1;
> >
> > Clarify please the need to set the length intermittently to -1. Thanks.
> 
> This corresponds to setting a "busy bit" in kernel implementation.
> These litmus tests are supposed to be correlated with in-kernel
> implementation, I'm not sure I want to maintain extra 4 copies of
> comments here and in kernel code. Especially for 2-producer cases,
> there are 2 identical P1 and P2, which is unfortunate, but I haven't
> figured out how to have a re-usable pieces of code with litmus tests
> :)

I disagree that comments related to memory ordering are optional. IMHO, the
documentation should be clear from a memory ordering standpoint. After all,
good Documentation/ always clarifies something / some concept to the reader
right? :-) Please have mercy on me, I am just trying to learn *your*
Documentation ;-)

> > > diff --git a/Documentation/litmus-tests/bpf-rb/bpf-rb+2p1c+bounded.litmus 
> > > b/Documentation/litmus-tests/bpf-rb/bpf-rb+2p1c+bounded.litmus
[...]
> > > +P1(int *lenFail, int *len1, spinlock_t *rb_lock, int *px, int *cx, 
> > > atomic_t *dropped)
> > > +{
> > > + int rPx;
> > > + int rCx;
> > > + int rFail;
> > > + int *rLenPtr;
> > > +
> > > + rFail = 0;
> > > + rLenPtr = lenFail;
> > > +
> > > + rCx = smp_load_acquire(cx);
> > > + spin_lock(rb_lock);
> > > +
> > > + rPx = *px;
> > > + if (rPx - rCx >= 1) {
> > > + atomic_inc(dropped);
> > > + spin_unlock(rb_lock);
> > > + } else {
> > > + if (rPx == 0) {
> > > + rLenPtr = len1;
> > > + } else if (rPx == 1) {
> > > + rLenPtr = len1;
> > > + } else {
> > > + rLenPtr = lenFail;
> > > + rFail = 1;
> > > + }
> > > +
> > > + 

Re: [PATCH linux-rcu] docs/litmus-tests: add BPF ringbuf MPSC litmus tests

2020-05-28 Thread Joel Fernandes
Hello Andrii,
This is quite exciting. Some comments below:

On Wed, May 27, 2020 at 11:24:08PM -0700, Andrii Nakryiko wrote:
[...]
> diff --git a/Documentation/litmus-tests/bpf-rb/bpf-rb+1p1c+bounded.litmus 
> b/Documentation/litmus-tests/bpf-rb/bpf-rb+1p1c+bounded.litmus
> new file mode 100644
> index ..558f054fb0b4
> --- /dev/null
> +++ b/Documentation/litmus-tests/bpf-rb/bpf-rb+1p1c+bounded.litmus
> @@ -0,0 +1,91 @@
> +C bpf-rb+1p1c+bounded
> +
> +(*
> + * Result: Always
> + *
> + * This litmus test validates BPF ring buffer implementation under the
> + * following assumptions:
> + * - 1 producer;
> + * - 1 consumer;
> + * - ring buffer has capacity for only 1 record.
> + *
> + * Expectations:
> + * - 1 record pushed into ring buffer;
> + * - 0 or 1 element is consumed.
> + * - no failures.
> + *)
> +
> +{
> + atomic_t dropped;
> +}
> +
> +P0(int *lenFail, int *len1, int *cx, int *px)
> +{
> + int *rLenPtr;
> + int rLen;
> + int rPx;
> + int rCx;
> + int rFail;
> +
> + rFail = 0;
> +
> + rCx = smp_load_acquire(cx);
> + rPx = smp_load_acquire(px);

Is it possible for you to put some more comments around which ACQUIRE is
paired with which RELEASE? And, in general more comments around the reason
for a certain memory barrier and what pairs with what. In the kernel sources,
the barriers needs a comment anyway.

> + if (rCx < rPx) {
> + if (rCx == 0) {
> + rLenPtr = len1;
> + } else {
> + rLenPtr = lenFail;
> + rFail = 1;
> + }
> +
> + rLen = smp_load_acquire(rLenPtr);
> + if (rLen == 0) {
> + rFail = 1;
> + } else if (rLen == 1) {
> + rCx = rCx + 1;
> + smp_store_release(cx, rCx);
> + }
> + }
> +}
> +
> +P1(int *lenFail, int *len1, spinlock_t *rb_lock, int *px, int *cx, atomic_t 
> *dropped)
> +{
> + int rPx;
> + int rCx;
> + int rFail;
> + int *rLenPtr;
> +
> + rFail = 0;
> +
> + rCx = smp_load_acquire(cx);
> + spin_lock(rb_lock);
> +
> + rPx = *px;
> + if (rPx - rCx >= 1) {
> + atomic_inc(dropped);

Why does 'dropped' need to be atomic if you are always incrementing under a
lock?

> + spin_unlock(rb_lock);
> + } else {
> + if (rPx == 0) {
> + rLenPtr = len1;
> + } else {
> + rLenPtr = lenFail;
> + rFail = 1;
> + }
> +
> + *rLenPtr = -1;

Clarify please the need to set the length intermittently to -1. Thanks.

> + smp_store_release(px, rPx + 1);
> +
> + spin_unlock(rb_lock);
> +
> + smp_store_release(rLenPtr, 1);
> + }
> +}
> +
> +exists (
> + 0:rFail=0 /\ 1:rFail=0
> + /\
> + (
> + (dropped=0 /\ px=1 /\ len1=1 /\ (cx=0 \/ cx=1))
> + )
> +)
> diff --git a/Documentation/litmus-tests/bpf-rb/bpf-rb+1p1c+unbound.litmus 
> b/Documentation/litmus-tests/bpf-rb/bpf-rb+1p1c+unbound.litmus
> new file mode 100644
> index ..7ab5d0e6e49f
> --- /dev/null
> +++ b/Documentation/litmus-tests/bpf-rb/bpf-rb+1p1c+unbound.litmus

I wish there was a way to pass args to litmus tests, then perhaps it would
have been possible to condense some of these tests. :-)

> diff --git a/Documentation/litmus-tests/bpf-rb/bpf-rb+2p1c+bounded.litmus 
> b/Documentation/litmus-tests/bpf-rb/bpf-rb+2p1c+bounded.litmus
> new file mode 100644
> index ..83f80328c92b
> --- /dev/null
> +++ b/Documentation/litmus-tests/bpf-rb/bpf-rb+2p1c+bounded.litmus
[...]
> +P0(int *lenFail, int *len1, int *cx, int *px)
> +{
> + int *rLenPtr;
> + int rLen;
> + int rPx;
> + int rCx;
> + int rFail;
> +
> + rFail = 0;
> +
> + rCx = smp_load_acquire(cx);
> + rPx = smp_load_acquire(px);
> + if (rCx < rPx) {
> + if (rCx == 0) {
> + rLenPtr = len1;
> + } else if (rCx == 1) {
> + rLenPtr = len1;
> + } else {
> + rLenPtr = lenFail;
> + rFail = 1;
> + }
> +
> + rLen = smp_load_acquire(rLenPtr);
> + if (rLen == 0) {
> + rFail = 1;
> + } else if (rLen == 1) {
> + rCx = rCx + 1;
> + smp_store_release(cx, rCx);
> + }
> + }
> +
> + rPx = smp_load_acquire(px);
> + if (rCx < rPx) {
> + if (rCx == 0) {
> + rLenPtr = len1;
> + } else if (rCx == 1) {
> + rLenPtr = len1;
> + } else {
> + rLenPtr = lenFail;
> + rFail = 1;
> + }
> +
> + rLen = smp_load_acquire(rLenPtr);
> + if (rLen == 0) {
> + rFail = 1;
> + } else 

Re: samples/bpf compilation failures - 5.2.0

2019-06-27 Thread Joel Fernandes
> > On 5/28/2019 2:27 PM, Srinivas Ramana wrote:
> > > Hello,
> > >
> > > I am trying to build samples/bpf in kernel(5.2.0-rc1) but unsuccessful
> > > with below errors. Can you help to point what i am missing or if there
> > > is some known issue?

By the way have you just tried building it on an ARM debian chroot? It
is not worth IMO spending time on cross compiler issues if you can
just native compile it within a chroot (as I do). Cross compilation
does not get a lot of testing, so even if we fix it its likely to come
up again as I've experienced. If you want a debian chroot that is
Android friendly, you can find one here:
https://github.com/joelagnel/adeb (comes with llvm, gcc etc).  I have
done lots of native compilation on a Pixel phone.

J.



> > >
> > > ==8<===
> > > $ make samples/bpf/
> > > LLC=/local/mnt/workspace/tools/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/llc
> > > CLANG=/local/mnt/workspace/tools/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/clang
> > > V=1
> > > make -C /local/mnt/workspace/sramana/kdev_torvalds/kdev/kernel -f
> > > /local/mnt/workspace/sramana/kdev_torvalds/kdev/kernel/Makefile
> > > samples/bpf/
> > > 
> > > 
> > > 
> > > make KBUILD_MODULES=1 -f ./scripts/Makefile.build obj=samples/bpf
> > > (cat /dev/null; ) > samples/bpf/modules.order
> > > make -C
> > > /local/mnt/workspace/sramana/kdev_torvalds/kdev/kernel/samples/bpf/../../tools/lib/bpf/
> > > RM='rm -rf' LDFLAGS=
> > > srctree=/local/mnt/workspace/sramana/kdev_torvalds/kdev/kernel/samples/bpf/../../
> > > O=
> > >
> > > Auto-detecting system features:
> > > ...libelf: [ on  ]
> > > ...   bpf: [ on  ]
> > >
> > > make -C
> > > /local/mnt/workspace/sramana/kdev_torvalds/kdev/kernel/samples/bpf/../..//tools/build
> > > CFLAGS= LDFLAGS= fixdep
> > > make -f
> > > /local/mnt/workspace/sramana/kdev_torvalds/kdev/kernel/samples/bpf/../..//tools/build/Makefile.build
> > > dir=. obj=fixdep
> > > ld -r -o fixdep-in.o  fixdep.o
> > > ld: fixdep.o: Relocations in generic ELF (EM: 183)
> > > ld: fixdep.o: Relocations in generic ELF (EM: 183)
> > > fixdep.o: error adding symbols: File in wrong format
> > > make[5]: *** [fixdep-in.o] Error 1
> > > make[4]: *** [fixdep-in.o] Error 2
> > > make[3]: *** [fixdep] Error 2
> > > make[2]: ***
> > > [/local/mnt/workspace/sramana/kdev_torvalds/kdev/kernel/samples/bpf/../../tools/lib/bpf/libbpf.a]
> > > Error 2
> > > make[1]: *** [samples/bpf/] Error 2
> > > make: *** [sub-make] Error 2
> > > ==>8===
> > >
> > >
> > > I am using the below commands to build:
> > > 
> > > export ARCH=arm64
> > > export CROSS_COMPILE=linaro-toolchain/5.1/bin/aarch64-linux-gnu-
> > > export CLANG_TRIPLE=arm64-linux-gnu-
> > >
> > > make
> > > CC=/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/clang
> > > defconfig
> > >
> > > make
> > > CC=/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/clang
> > > -j8
> > >
> > > make
> > > CC=/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/clang
> > > headers_install INSTALL_HDR_PATH=./usr
> > >
> > > make samples/bpf/
> > > LLC=/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/llc
> > > CLANG=/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/clang
> > > V=1
> > > CC=/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/clang
> > >
> > > 
> > >
> > > Thanks,
> > > -- Srinivas R
> > >
> >
> >
> > --
> > Qualcomm India Private Limited, on behalf of Qualcomm Innovation
> > Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
> > Collaborative Project


Re: samples/bpf compilation failures - 5.2.0

2019-06-26 Thread Joel Fernandes
On Wed, Jun 26, 2019 at 1:41 AM Srinivas Ramana  wrote:
>
> + Joel if he has seen this issue.
>

I have not seen this issue and it has been some time since I built BPF
samples, sorry. I am mostly building BPF programs through bcc/bpftrace
and Android's build system.

We ought to make samples easier to build though, I remember when I
built it last year there were header related issues which should be
fixed. It could be that new BPF features broke it again.

 J.



> On 5/28/2019 2:27 PM, Srinivas Ramana wrote:
> > Hello,
> >
> > I am trying to build samples/bpf in kernel(5.2.0-rc1) but unsuccessful
> > with below errors. Can you help to point what i am missing or if there
> > is some known issue?
> >
> > ==8<===
> > $ make samples/bpf/
> > LLC=/local/mnt/workspace/tools/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/llc
> > CLANG=/local/mnt/workspace/tools/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/clang
> > V=1
> > make -C /local/mnt/workspace/sramana/kdev_torvalds/kdev/kernel -f
> > /local/mnt/workspace/sramana/kdev_torvalds/kdev/kernel/Makefile
> > samples/bpf/
> > 
> > 
> > 
> > make KBUILD_MODULES=1 -f ./scripts/Makefile.build obj=samples/bpf
> > (cat /dev/null; ) > samples/bpf/modules.order
> > make -C
> > /local/mnt/workspace/sramana/kdev_torvalds/kdev/kernel/samples/bpf/../../tools/lib/bpf/
> > RM='rm -rf' LDFLAGS=
> > srctree=/local/mnt/workspace/sramana/kdev_torvalds/kdev/kernel/samples/bpf/../../
> > O=
> >
> > Auto-detecting system features:
> > ...libelf: [ on  ]
> > ...   bpf: [ on  ]
> >
> > make -C
> > /local/mnt/workspace/sramana/kdev_torvalds/kdev/kernel/samples/bpf/../..//tools/build
> > CFLAGS= LDFLAGS= fixdep
> > make -f
> > /local/mnt/workspace/sramana/kdev_torvalds/kdev/kernel/samples/bpf/../..//tools/build/Makefile.build
> > dir=. obj=fixdep
> > ld -r -o fixdep-in.o  fixdep.o
> > ld: fixdep.o: Relocations in generic ELF (EM: 183)
> > ld: fixdep.o: Relocations in generic ELF (EM: 183)
> > fixdep.o: error adding symbols: File in wrong format
> > make[5]: *** [fixdep-in.o] Error 1
> > make[4]: *** [fixdep-in.o] Error 2
> > make[3]: *** [fixdep] Error 2
> > make[2]: ***
> > [/local/mnt/workspace/sramana/kdev_torvalds/kdev/kernel/samples/bpf/../../tools/lib/bpf/libbpf.a]
> > Error 2
> > make[1]: *** [samples/bpf/] Error 2
> > make: *** [sub-make] Error 2
> > ==>8===
> >
> >
> > I am using the below commands to build:
> > 
> > export ARCH=arm64
> > export CROSS_COMPILE=linaro-toolchain/5.1/bin/aarch64-linux-gnu-
> > export CLANG_TRIPLE=arm64-linux-gnu-
> >
> > make
> > CC=/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/clang
> > defconfig
> >
> > make
> > CC=/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/clang
> > -j8
> >
> > make
> > CC=/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/clang
> > headers_install INSTALL_HDR_PATH=./usr
> >
> > make samples/bpf/
> > LLC=/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/llc
> > CLANG=/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/clang
> > V=1
> > CC=/clang_ubuntu/clang/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-14.04/bin/clang
> >
> > 
> >
> > Thanks,
> > -- Srinivas R
> >
>
>
> --
> Qualcomm India Private Limited, on behalf of Qualcomm Innovation
> Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
> Collaborative Project


Re: [PATCH 1/2 v2] kprobe: Do not use uaccess functions to access kernel memory that can fault

2019-02-28 Thread Joel Fernandes
On Thu, Feb 28, 2019 at 09:29:13PM +0900, Masami Hiramatsu wrote:
> On Tue, 26 Feb 2019 10:24:47 -0500
> Joel Fernandes  wrote:
> 
> > On Fri, Feb 22, 2019 at 11:27:05AM -0800, Alexei Starovoitov wrote:
> > > On Fri, Feb 22, 2019 at 09:43:14AM -0800, Linus Torvalds wrote:
> > > > 
> > > > Then we should still probably fix up "__probe_kernel_read()" to not
> > > > allow user accesses. The easiest way to do that is actually likely to
> > > > use the "unsafe_get_user()" functions *without* doing a
> > > > uaccess_begin(), which will mean that modern CPU's will simply fault
> > > > on a kernel access to user space.
> > > 
> > > On bpf side the bpf_probe_read() helper just calls probe_kernel_read()
> > > and users pass both user and kernel addresses into it and expect
> > > that the helper will actually try to read from that address.
> > 
> > Slightly related and FWIW, BCC's eBPF-based opensnoop tool [1] installs a
> > kprobe on do_sys_open to monitor calls to the open syscall globally.
> > 
> > do_sys_open() has prototype:
> > 
> > long do_sys_open(int dfd, const char __user *filename, int flags, umode_t 
> > mode);
> > 
> > This causes a "blank" filename to be displayed by opensnoop when I run it on
> > my Pixel 3 (arm64), possibly because this is a user pointer. However, it
> > works fine on x86-64.
> > 
> > So it seems to me that on arm64, reading user pointers directly still 
> > doesn't
> > work even if there is a distinction between user/kernel addresses. In that
> > case reading the user pointer using user accessors (possibly using
> > bpf_probe_user_read helper) should be needed to fix this issue (as Yonghong
> > also privately discussed with me).
> 
> OK, it sounds like the same issue. Please add a bpf_user_read() and use it
> for __user pointer.

CC'd Yonghong who said eariler to me he would add it, but I could add it too
if he wants me to.

thanks,

 - Joel



Re: [PATCH v2 3/6] sched/cpufreq: Annotate cpufreq_update_util_data pointer with __rcu

2019-02-27 Thread Joel Fernandes
On Mon, Feb 25, 2019 at 01:10:26PM -0800, Paul E. McKenney wrote:
> On Sat, Feb 23, 2019 at 01:34:31AM -0500, Joel Fernandes (Google) wrote:
> > Recently I added an RCU annotation check to rcu_assign_pointer(). All
> > pointers assigned to RCU protected data are to be annotated with __rcu
> > inorder to be able to use rcu_assign_pointer() similar to checks in
> > other RCU APIs.
> > 
> > This resulted in a sparse error: kernel//sched/cpufreq.c:41:9: sparse:
> > error: incompatible types in comparison expression (different address
> > spaces)
> > 
> > Fix this by annotating cpufreq_update_util_data pointer with __rcu. This
> > will also help sparse catch any future RCU misuage bugs.
> > 
> > Signed-off-by: Joel Fernandes (Google) 
> 
> From an RCU perspective:
> 
> Reviewed-by: Paul E. McKenney 

Thanks a lot Paul.

Peter, Rafael, Steve,
Are you Ok with the patches 3-6? It will be nice to quiet down sparse.

thanks,

 - Joel

> > ---
> >  kernel/sched/cpufreq.c | 2 +-
> >  kernel/sched/sched.h   | 2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/sched/cpufreq.c b/kernel/sched/cpufreq.c
> > index 22bd8980f32f..e316ee7bb2e5 100644
> > --- a/kernel/sched/cpufreq.c
> > +++ b/kernel/sched/cpufreq.c
> > @@ -7,7 +7,7 @@
> >   */
> >  #include "sched.h"
> >  
> > -DEFINE_PER_CPU(struct update_util_data *, cpufreq_update_util_data);
> > +DEFINE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
> >  
> >  /**
> >   * cpufreq_add_update_util_hook - Populate the CPU's update_util_data 
> > pointer.
> > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > index d04530bf251f..2ab545d40381 100644
> > --- a/kernel/sched/sched.h
> > +++ b/kernel/sched/sched.h
> > @@ -2166,7 +2166,7 @@ static inline u64 irq_time_read(int cpu)
> >  #endif /* CONFIG_IRQ_TIME_ACCOUNTING */
> >  
> >  #ifdef CONFIG_CPU_FREQ
> > -DECLARE_PER_CPU(struct update_util_data *, cpufreq_update_util_data);
> > +DECLARE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
> >  
> >  /**
> >   * cpufreq_update_util - Take a note about CPU utilization changes.
> > -- 
> > 2.21.0.rc0.258.g878e2cd30e-goog
> > 
> 


Re: [PATCH 1/2 v2] kprobe: Do not use uaccess functions to access kernel memory that can fault

2019-02-26 Thread Joel Fernandes
On Fri, Feb 22, 2019 at 11:27:05AM -0800, Alexei Starovoitov wrote:
> On Fri, Feb 22, 2019 at 09:43:14AM -0800, Linus Torvalds wrote:
> > 
> > Then we should still probably fix up "__probe_kernel_read()" to not
> > allow user accesses. The easiest way to do that is actually likely to
> > use the "unsafe_get_user()" functions *without* doing a
> > uaccess_begin(), which will mean that modern CPU's will simply fault
> > on a kernel access to user space.
> 
> On bpf side the bpf_probe_read() helper just calls probe_kernel_read()
> and users pass both user and kernel addresses into it and expect
> that the helper will actually try to read from that address.

Slightly related and FWIW, BCC's eBPF-based opensnoop tool [1] installs a
kprobe on do_sys_open to monitor calls to the open syscall globally.

do_sys_open() has prototype:

long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode);

This causes a "blank" filename to be displayed by opensnoop when I run it on
my Pixel 3 (arm64), possibly because this is a user pointer. However, it
works fine on x86-64.

So it seems to me that on arm64, reading user pointers directly still doesn't
work even if there is a distinction between user/kernel addresses. In that
case reading the user pointer using user accessors (possibly using
bpf_probe_user_read helper) should be needed to fix this issue (as Yonghong
also privately discussed with me).

[1] https://github.com/iovisor/bcc/blob/master/tools/opensnoop.py#L140

thanks!

 - Joel


> 
> If __probe_kernel_read will suddenly start failing on all user addresses
> it will break the expectations.
> How do we solve it in bpf_probe_read?
> Call probe_kernel_read and if that fails call unsafe_get_user byte-by-byte
> in the loop?
> That's doable, but people already complain that bpf_probe_read() is slow
> and shows up in their perf report.
> 


[PATCH v2 4/6] sched_domain: Annotate RCU pointers properly

2019-02-22 Thread Joel Fernandes (Google)
The scheduler uses RCU API in various places to access sched_domain
pointers. These cause sparse errors as below.

Many new errors show up because of an annotation check I added to
rcu_assign_pointer(). Let us annotate the pointers correctly which also
will help sparse catch any potential future bugs.

This fixes the following sparse errors:

rt.c:1681:9: error: incompatible types in comparison expression
deadline.c:1904:9: error: incompatible types in comparison expression
core.c:519:9: error: incompatible types in comparison expression
core.c:1634:17: error: incompatible types in comparison expression
fair.c:6193:14: error: incompatible types in comparison expression
fair.c:9883:22: error: incompatible types in comparison expression
fair.c:9897:9: error: incompatible types in comparison expression
sched.h:1287:9: error: incompatible types in comparison expression
topology.c:612:9: error: incompatible types in comparison expression
topology.c:615:9: error: incompatible types in comparison expression
sched.h:1300:9: error: incompatible types in comparison expression
topology.c:618:9: error: incompatible types in comparison expression
sched.h:1287:9: error: incompatible types in comparison expression
topology.c:621:9: error: incompatible types in comparison expression
sched.h:1300:9: error: incompatible types in comparison expression
topology.c:624:9: error: incompatible types in comparison expression
topology.c:671:9: error: incompatible types in comparison expression
stats.c:45:17: error: incompatible types in comparison expression
fair.c:5998:15: error: incompatible types in comparison expression
fair.c:5989:15: error: incompatible types in comparison expression
fair.c:5998:15: error: incompatible types in comparison expression
fair.c:5989:15: error: incompatible types in comparison expression
fair.c:6120:19: error: incompatible types in comparison expression
fair.c:6506:14: error: incompatible types in comparison expression
fair.c:6515:14: error: incompatible types in comparison expression
fair.c:6623:9: error: incompatible types in comparison expression
fair.c:5970:17: error: incompatible types in comparison expression
fair.c:8642:21: error: incompatible types in comparison expression
fair.c:9253:9: error: incompatible types in comparison expression
fair.c:9331:9: error: incompatible types in comparison expression
fair.c:9519:15: error: incompatible types in comparison expression
fair.c:9533:14: error: incompatible types in comparison expression
fair.c:9542:14: error: incompatible types in comparison expression
fair.c:9567:14: error: incompatible types in comparison expression
fair.c:9597:14: error: incompatible types in comparison expression
fair.c:9421:16: error: incompatible types in comparison expression
fair.c:9421:16: error: incompatible types in comparison expression

Signed-off-by: Joel Fernandes (Google) 
---
 include/linux/sched/topology.h |  4 ++--
 kernel/sched/sched.h   | 14 +++---
 kernel/sched/topology.c| 10 +-
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index c31d3a47a47c..4819c9e01e42 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -76,8 +76,8 @@ struct sched_domain_shared {
 
 struct sched_domain {
/* These fields must be setup */
-   struct sched_domain *parent;/* top domain must be null terminated */
-   struct sched_domain *child; /* bottom domain must be null 
terminated */
+   struct sched_domain __rcu *parent;  /* top domain must be null 
terminated */
+   struct sched_domain __rcu *child;   /* bottom domain must be null 
terminated */
struct sched_group *groups; /* the balancing groups of the domain */
unsigned long min_interval; /* Minimum balance interval ms */
unsigned long max_interval; /* Maximum balance interval ms */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 2ab545d40381..ca6a79f57e7a 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -866,8 +866,8 @@ struct rq {
atomic_tnr_iowait;
 
 #ifdef CONFIG_SMP
-   struct root_domain  *rd;
-   struct sched_domain *sd;
+   struct root_domain  *rd;
+   struct sched_domain __rcu   *sd;
 
unsigned long   cpu_capacity;
unsigned long   cpu_capacity_orig;
@@ -1305,13 +1305,13 @@ static inline struct sched_domain 
*lowest_flag_domain(int cpu, int flag)
return sd;
 }
 
-DECLARE_PER_CPU(struct sched_domain *, sd_llc);
+DECLARE_PER_CPU(struct sched_domain __rcu *, sd_llc);
 DECLARE_PER_CPU(int, sd_llc_size);
 DECLARE_PER_CPU(int, sd_llc_id);
-DECLARE_PER_CPU(struct sched_domain_shared *, sd_llc_shared);
-DECLARE_PER_CPU(struct sched_domain *, sd_numa);
-DECLARE_PER_CPU(struct sched_domain *, sd_asym_packing);
-DECLARE_PER_CPU(struct sched_domain *, sd_asym_cpucapacity

[PATCH v2 5/6] rcuwait: Annotate task_struct with __rcu

2019-02-22 Thread Joel Fernandes (Google)
This suppresses sparse error generated due to the recently added
rcu_assign_pointer sparse check.

percpu-rwsem.c:162:9: sparse: error: incompatible types in comparison expression
exit.c:316:16: sparse: error: incompatible types in comparison expression

Signed-off-by: Joel Fernandes (Google) 
---
 include/linux/rcuwait.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/rcuwait.h b/include/linux/rcuwait.h
index 90bfa3279a01..563290fc194f 100644
--- a/include/linux/rcuwait.h
+++ b/include/linux/rcuwait.h
@@ -18,7 +18,7 @@
  * awoken.
  */
 struct rcuwait {
-   struct task_struct *task;
+   struct task_struct __rcu *task;
 };
 
 #define __RCUWAIT_INITIALIZER(name)\
-- 
2.21.0.rc0.258.g878e2cd30e-goog



[PATCH v2 3/6] sched/cpufreq: Annotate cpufreq_update_util_data pointer with __rcu

2019-02-22 Thread Joel Fernandes (Google)
Recently I added an RCU annotation check to rcu_assign_pointer(). All
pointers assigned to RCU protected data are to be annotated with __rcu
inorder to be able to use rcu_assign_pointer() similar to checks in
other RCU APIs.

This resulted in a sparse error: kernel//sched/cpufreq.c:41:9: sparse:
error: incompatible types in comparison expression (different address
spaces)

Fix this by annotating cpufreq_update_util_data pointer with __rcu. This
will also help sparse catch any future RCU misuage bugs.

Signed-off-by: Joel Fernandes (Google) 
---
 kernel/sched/cpufreq.c | 2 +-
 kernel/sched/sched.h   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/cpufreq.c b/kernel/sched/cpufreq.c
index 22bd8980f32f..e316ee7bb2e5 100644
--- a/kernel/sched/cpufreq.c
+++ b/kernel/sched/cpufreq.c
@@ -7,7 +7,7 @@
  */
 #include "sched.h"
 
-DEFINE_PER_CPU(struct update_util_data *, cpufreq_update_util_data);
+DEFINE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
 
 /**
  * cpufreq_add_update_util_hook - Populate the CPU's update_util_data pointer.
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index d04530bf251f..2ab545d40381 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2166,7 +2166,7 @@ static inline u64 irq_time_read(int cpu)
 #endif /* CONFIG_IRQ_TIME_ACCOUNTING */
 
 #ifdef CONFIG_CPU_FREQ
-DECLARE_PER_CPU(struct update_util_data *, cpufreq_update_util_data);
+DECLARE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
 
 /**
  * cpufreq_update_util - Take a note about CPU utilization changes.
-- 
2.21.0.rc0.258.g878e2cd30e-goog



[PATCH v2 6/6] sched: Annotate perf_domain pointer with __rcu

2019-02-22 Thread Joel Fernandes (Google)
This fixes the following sparse errors in sched/fair.c:

fair.c:6506:14: error: incompatible types in comparison expression
fair.c:8642:21: error: incompatible types in comparison expression

Using __rcu will also help sparse catch any future bugs.

Signed-off-by: Joel Fernandes (Google) 
---
 kernel/sched/sched.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index ca6a79f57e7a..c8e6514433a9 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -780,7 +780,7 @@ struct root_domain {
 * NULL-terminated list of performance domains intersecting with the
 * CPUs of the rd. Protected by RCU.
 */
-   struct perf_domain  *pd;
+   struct perf_domain __rcu *pd;
 };
 
 extern struct root_domain def_root_domain;
-- 
2.21.0.rc0.258.g878e2cd30e-goog



[PATCH v2 0/6] RCU fixes for rcu_assign_pointer() usage

2019-02-22 Thread Joel Fernandes (Google)
These patches fix various sparse errors found as a result of the recent check
to add rcu_check_sparse() to rcu_assign_pointer().  The errors in some cases
seem to either missing API usage, or missing annotations. The annotations added
in the series can also help avoid future incorrect usages and bugs so it is a
good idea to do in any case.

RFC v1 -> Patch v2:
Made changes to various scheduler patches (Peter Zijlstra)

Joel Fernandes (Google) (6):
net: rtnetlink: Fix incorrect RCU API usage
ixgbe: Fix incorrect RCU API usage
sched/cpufreq: Annotate cpufreq_update_util_data pointer with __rcu
sched_domain: Annotate RCU pointers properly
rcuwait: Annotate task_struct with __rcu
sched: Annotate perf_domain pointer with __rcu

drivers/net/ethernet/intel/ixgbe/ixgbe.h  |  4 ++--
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 15 ++-
include/linux/rcuwait.h   |  2 +-
include/linux/sched/topology.h|  4 ++--
kernel/sched/cpufreq.c|  2 +-
kernel/sched/sched.h  | 18 +-
kernel/sched/topology.c   | 10 +-
net/core/rtnetlink.c  |  4 ++--
8 files changed, 32 insertions(+), 27 deletions(-)

--
2.21.0.rc0.258.g878e2cd30e-goog



[PATCH v2 2/6] ixgbe: Fix incorrect RCU API usage

2019-02-22 Thread Joel Fernandes (Google)
Recently, I added an RCU annotation check in rcu_assign_pointer. This
caused a sparse error to be reported by the ixgbe driver.

Further looking, it seems the adapter->xdp_prog pointer is not annotated
with __rcu. Annonating it fixed the error, but caused a bunch of other
warnings.

This patch tries to fix all warnings by using RCU API properly. This
makes sense to do because not using RCU properly can result in various
hard to find bugs. This is a best effort fix and is only build tested.
The sparse errors and warnings go away with the change. I request
maintainers / developers in this area to review / test it properly.

The sparse error fixed is:
ixgbe_main.c:10256:25: error: incompatible types in comparison expression

Signed-off-by: Joel Fernandes (Google) 
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h  |  4 ++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 15 ++-
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h 
b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 08d85e336bd4..3b14daf27516 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -311,7 +311,7 @@ struct ixgbe_ring {
struct ixgbe_ring *next;/* pointer to next ring in q_vector */
struct ixgbe_q_vector *q_vector; /* backpointer to host q_vector */
struct net_device *netdev;  /* netdev ring belongs to */
-   struct bpf_prog *xdp_prog;
+   struct bpf_prog __rcu *xdp_prog;
struct device *dev; /* device for DMA mapping */
void *desc; /* descriptor ring memory */
union {
@@ -560,7 +560,7 @@ struct ixgbe_adapter {
unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
/* OS defined structs */
struct net_device *netdev;
-   struct bpf_prog *xdp_prog;
+   struct bpf_prog __rcu *xdp_prog;
struct pci_dev *pdev;
struct mii_bus *mii_bus;
 
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index daff8183534b..408a312aa6ba 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -2199,7 +2199,7 @@ static struct sk_buff *ixgbe_run_xdp(struct ixgbe_adapter 
*adapter,
u32 act;
 
rcu_read_lock();
-   xdp_prog = READ_ONCE(rx_ring->xdp_prog);
+   xdp_prog = rcu_dereference(rx_ring->xdp_prog);
 
if (!xdp_prog)
goto xdp_out;
@@ -6547,7 +6547,7 @@ int ixgbe_setup_rx_resources(struct ixgbe_adapter 
*adapter,
 rx_ring->queue_index) < 0)
goto err;
 
-   rx_ring->xdp_prog = adapter->xdp_prog;
+   rcu_assign_pointer(rx_ring->xdp_prog, adapter->xdp_prog);
 
return 0;
 err:
@@ -10246,7 +10246,8 @@ static int ixgbe_xdp_setup(struct net_device *dev, 
struct bpf_prog *prog)
if (nr_cpu_ids > MAX_XDP_QUEUES)
return -ENOMEM;
 
-   old_prog = xchg(&adapter->xdp_prog, prog);
+   old_prog = rcu_access_pointer(adapter->xdp_prog);
+   rcu_assign_pointer(adapter->xdp_prog, prog);
 
/* If transitioning XDP modes reconfigure rings */
if (!!prog != !!old_prog) {
@@ -10271,13 +10272,17 @@ static int ixgbe_xdp_setup(struct net_device *dev, 
struct bpf_prog *prog)
 static int ixgbe_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 {
struct ixgbe_adapter *adapter = netdev_priv(dev);
+   struct bpf_prog *prog;
 
switch (xdp->command) {
case XDP_SETUP_PROG:
return ixgbe_xdp_setup(dev, xdp->prog);
case XDP_QUERY_PROG:
-   xdp->prog_id = adapter->xdp_prog ?
-   adapter->xdp_prog->aux->id : 0;
+   rcu_read_lock();
+   prog = rcu_dereference(adapter->xdp_prog);
+   xdp->prog_id = prog ? prog->aux->id : 0;
+   rcu_read_unlock();
+
return 0;
case XDP_QUERY_XSK_UMEM:
return ixgbe_xsk_umem_query(adapter, &xdp->xsk.umem,
-- 
2.21.0.rc0.258.g878e2cd30e-goog



[PATCH v2 1/6] net: rtnetlink: Fix incorrect RCU API usage

2019-02-22 Thread Joel Fernandes (Google)
rtnl_register_internal() and rtnl_unregister_all tries to directly
dereference an RCU protected pointed outside RCU read side section.
While this is Ok to do since a lock is held, let us use the correct
API to avoid programmer bugs in the future.

This also fixes sparse warnings arising from not using RCU API.

net/core/rtnetlink.c:332:13: warning: incorrect type in assignment
(different address spaces) net/core/rtnetlink.c:332:13:expected
struct rtnl_link **tab net/core/rtnetlink.c:332:13:got struct
rtnl_link *[noderef] *

Signed-off-by: Joel Fernandes (Google) 
---
 net/core/rtnetlink.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 5ea1bed08ede..98be4b4818a9 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -188,7 +188,7 @@ static int rtnl_register_internal(struct module *owner,
msgindex = rtm_msgindex(msgtype);
 
rtnl_lock();
-   tab = rtnl_msg_handlers[protocol];
+   tab = rtnl_dereference(rtnl_msg_handlers[protocol]);
if (tab == NULL) {
tab = kcalloc(RTM_NR_MSGTYPES, sizeof(void *), GFP_KERNEL);
if (!tab)
@@ -329,7 +329,7 @@ void rtnl_unregister_all(int protocol)
BUG_ON(protocol < 0 || protocol > RTNL_FAMILY_MAX);
 
rtnl_lock();
-   tab = rtnl_msg_handlers[protocol];
+   tab = rtnl_dereference(rtnl_msg_handlers[protocol]);
if (!tab) {
rtnl_unlock();
return;
-- 
2.21.0.rc0.258.g878e2cd30e-goog



Re: [PATCH RFC 4/5] sched/topology: Annonate RCU pointers properly

2019-02-21 Thread Joel Fernandes
On Thu, Feb 21, 2019 at 04:29:44PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 21, 2019 at 10:10:57AM -0500, Joel Fernandes wrote:
> > Hi Peter,
> > 
> > Thanks for taking a look.
> > 
> > On Thu, Feb 21, 2019 at 10:19:44AM +0100, Peter Zijlstra wrote:
> > > On Thu, Feb 21, 2019 at 12:49:41AM -0500, Joel Fernandes (Google) wrote:
> > > 
> > > > Also replace rcu_assign_pointer call on rq->sd with WRITE_ONCE. This
> > > > should be sufficient for the rq->sd initialization.
> > > 
> > > > @@ -668,7 +668,7 @@ cpu_attach_domain(struct sched_domain *sd, struct 
> > > > root_domain *rd, int cpu)
> > > >  
> > > > rq_attach_root(rq, rd);
> > > > tmp = rq->sd;
> > > > -   rcu_assign_pointer(rq->sd, sd);
> > > > +   WRITE_ONCE(rq->sd, sd);
> > > > dirty_sched_domain_sysctl(cpu);
> > > > destroy_sched_domains(tmp);
> > > 
> > > Where did the RELEASE barrier go?
> > > 
> > > That was a publish operation, now it is not.
> > 
> > Funny thing is, initially I had written this patch with smp_store_release()
> > instead of WRITE_ONCE, but checkpatch complaints with that since it needs a
> > comment on top of it, and I wasn't sure if RELEASE barrier was the intent of
> > using rcu_assign_pointer (all the more reason to replace it with something
> > more explicit).
> > 
> > I will replace it with the following and resubmit it then:
> > 
> > /* Release barrier */
> > smp_store_release(&rq->sd, sd);
> > 
> > Or do we want to just drop the "Release barrier" comment and live with the
> > checkpatch warning?
> 
> How about we keep using rcu_assign_pointer(), the whole sched domain
> tree is under rcu; peruse that destroy_sched_domains() function for
> instance.
> 
> Also check how for_each_domain() uses rcu_dereference().

May be then, all those pointers should be made __rcu as well. Then we can use
rcu_assign_pointer() here. I will look more into it and study these functions
as you are suggesting.

thanks,

- Joel



Re: [PATCH RFC 3/5] sched/cpufreq: Fix incorrect RCU API usage

2019-02-21 Thread Joel Fernandes
On Thu, Feb 21, 2019 at 05:11:44PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 21, 2019 at 07:52:18AM -0800, Paul E. McKenney wrote:
> > On Thu, Feb 21, 2019 at 04:31:17PM +0100, Peter Zijlstra wrote:
> > > On Thu, Feb 21, 2019 at 10:21:39AM -0500, Joel Fernandes wrote:
> > > > On Thu, Feb 21, 2019 at 10:18:05AM +0100, Peter Zijlstra wrote:
> > > > > On Thu, Feb 21, 2019 at 12:49:40AM -0500, Joel Fernandes (Google) 
> > > > > wrote:
> > > > > > @@ -34,8 +34,12 @@ void cpufreq_add_update_util_hook(int cpu, 
> > > > > > struct update_util_data *data,
> > > > > > if (WARN_ON(!data || !func))
> > > > > > return;
> > > > > >  
> > > > > > -   if (WARN_ON(per_cpu(cpufreq_update_util_data, cpu)))
> > > > > > +   rcu_read_lock();
> > > > > > +   if (WARN_ON(rcu_dereference(per_cpu(cpufreq_update_util_data, 
> > > > > > cpu {
> > > > > > +   rcu_read_unlock();
> > > > > > return;
> > > > > > +   }
> > > > > > +   rcu_read_unlock();
> > > > > >  
> > > > > > data->func = func;
> > > > > > rcu_assign_pointer(per_cpu(cpufreq_update_util_data, cpu), 
> > > > > > data);
> 
> > For whatever it is worth, in that case it could use rcu_access_pointer().
> > And this primitive does not do the lockdep check for being within an RCU
> > read-side critical section.  As Peter says, if there is no dereferencing,
> > there can be no use-after-free bug, so the RCU read-side critical is
> > not needed.
> 
> On top of that, I suspect this is under the write-side lock (we're doing
> assignment after all).

Yes it is under a policy->rwsem, just confirmed that.

And indeed rcu_read_lock() is not needed here in this patch, thanks for
pointing that out. Sometimes the word "dereference" plays some visual tricks
and in this case tempted me to add an RCU reader section ;-) Assuming you
guys are in agreement with usage of rcu_access_pointer() here to keep sparse
happy, I'll rework the patch accordingly and resubmit with that.

thanks!

 - Joel





Re: [PATCH v2 1/2] Provide in-kernel headers for making it easy to extend the kernel

2019-02-21 Thread Joel Fernandes
On Thu, Feb 21, 2019 at 11:34:41PM +0900, Masahiro Yamada wrote:
> On Wed, Feb 20, 2019 at 12:17 AM Joel Fernandes  
> wrote:
> 
> >
> > Firstly, I want to apologize for not testing this and other corner cases you
> > brought up. I should have known better. Since my build was working, I 
> > assumed
> > that the feature is working. For that, I am very sorry.
> 
> 
> You do not need to apologize. 0day bot usually catches build errors.
> I guess 0day bot performs compile-tests only incrementally
> and that is why we did not get any report.

Oh ok :) thanks.

> > Secondly, it turns out Module.symvers circularly dependency problem also
> > exists with another use case.
> > If one does 'make modules_prepare' in a base kernel tree and then tries to
> > build modules with that tree, a warning like this is printed but the module
> > still gets built:
> >
> >   WARNING: Symbol version dump ./Module.symvers
> >is missing; modules will have no dependencies and modversions.
> >
> >   CC [M]  /tmp/testmod/test.o
> >   Building modules, stage 2.
> >   MODPOST 1 modules
> >   CC  /tmp/testmod/test.mod.o
> >   LD [M]  /tmp/testmod/test.ko
> >
> > So, I am thinking that at least for first pass I will just drop the 
> > inclusion
> > of Module.symvers in the archive and allow any modules built using
> > /proc/kheaders.tar.xz to not use it.
> >
> > Kbuild will print a warning anyway when anyone tries to build using
> > /proc/kheaders.tar.xz, so if the user really wants module symbol versioning
> > then they should probably use a full kernel source tree with Module.symvers
> > available. For our usecase, kernel symbol versioning is a bit useless when
> > using /proc/kheaders.tar.gz because the proc file is generated with the same
> > kernel that the module is being built against, and subsequently loaded into
> > the kernel. So it is not likely that the CRC of a kernel symbol will be
> > different from what the module expects.
> 
> 
> Without Module.symver, modpost cannot check whether references are
> resolvable or not.
> 
> You will see "WARNING ... undefined" for every symbol referenced from
> the module.
> 
> 
> I am not an Android developer.
> So, I will leave this judge to other people.

IMO I don't see a way around this limiation but it would be nice if there was
a way to make it work. Since the kernel modules being built by this mechanism
are for tracing/debugging purposes, it is not a major concern for us.


> One more request if you have a chance to submit the next version.
> Please do not hide error messages.

Actually it was intended to suppress noise, not hide errors as such. I have
fixed all the errors in the next version and will be submitting it soon.

Thanks a lot for the review!

 - Joel


> I wondered why you redirected stdout/stderr from the script.
> 
> I applied the following patch, and I tested.  Then I see why.
> 
> Please fix your code instead of hiding underlying problems.
> 
> 
> diff --git a/kernel/Makefile b/kernel/Makefile
> index 1d13a7a..a76ccbd 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -148,7 +148,7 @@ $(obj)/kheaders.o: $(obj)/kheaders_data.h
>  targets += kheaders_data.txz
> 
>  quiet_cmd_genikh = GEN $(obj)/kheaders_data.txz
> -cmd_genikh = $(srctree)/scripts/gen_ikh_data.sh $@ $^ >/dev/null 2>&1
> +cmd_genikh = $(srctree)/scripts/gen_ikh_data.sh $@ $^
>  $(obj)/kheaders_data.txz: $(ikh_file_list) FORCE
> $(call cmd,genikh)
> 
> 
> 
> 
> 
> 
> masahiro@grover:~/workspace/linux-yamada$ make
>   CALLscripts/checksyscalls.sh
>   DESCEND  objtool
>   CHK include/generated/compile.h
>   GEN kernel/kheaders_data.txz
> find: ‘FORCE’: No such file or directory
> 70106 blocks
> Can't do inplace edit: kernel/kheaders_data.txz.tmp is not a regular file.
> Can't do inplace edit: kernel/kheaders_data.txz.tmp/arch is not a regular 
> file.
> Can't do inplace edit: kernel/kheaders_data.txz.tmp/arch/x86 is not a
> regular file.
> Can't do inplace edit: kernel/kheaders_data.txz.tmp/arch/x86/include
> is not a regular file.
> Can't do inplace edit:
> kernel/kheaders_data.txz.tmp/arch/x86/include/uapi is not a regular
> file.
> Can't do inplace edit:
> kernel/kheaders_data.txz.tmp/arch/x86/include/uapi/asm is not a
> regular file.
> Can't do inplace edit:
> kernel/kheaders_data.txz.tmp/arch/x86/include/generated is not a
> regular file.
> Can't do inplace edit:
> kernel/kheaders_data.txz.tmp/arch/x86/include/generated/uapi is not a
> regular file.
> Can't do inpla

Re: [PATCH RFC 3/5] sched/cpufreq: Fix incorrect RCU API usage

2019-02-21 Thread Joel Fernandes
On Thu, Feb 21, 2019 at 10:18:05AM +0100, Peter Zijlstra wrote:
> On Thu, Feb 21, 2019 at 12:49:40AM -0500, Joel Fernandes (Google) wrote:
> > @@ -34,8 +34,12 @@ void cpufreq_add_update_util_hook(int cpu, struct 
> > update_util_data *data,
> > if (WARN_ON(!data || !func))
> > return;
> >  
> > -   if (WARN_ON(per_cpu(cpufreq_update_util_data, cpu)))
> > +   rcu_read_lock();
> > +   if (WARN_ON(rcu_dereference(per_cpu(cpufreq_update_util_data, cpu {
> > +   rcu_read_unlock();
> > return;
> > +   }
> > +   rcu_read_unlock();
> >  
> > data->func = func;
> > rcu_assign_pointer(per_cpu(cpufreq_update_util_data, cpu), data);
> 
> This doesn't make any kind of sense to me.
> 

As per the rcu_assign_pointer() line, I inferred that
cpufreq_update_util_data is expected to be RCU protected. Reading the pointer
value of RCU pointers generally needs to be done from RCU read section, and
using rcu_dereference() (or using rcu_access()).

In this patch, I changed cpufreq_update_util_data to be __rcu annotated to
avoid the sparse error thrown by rcu_assign_pointer().

Instead of doing that, If your intention here is RELEASE barrier, should I
just replace in this function:
rcu_assign_pointer(per_cpu(cpufreq_update_util_data, cpu), data);
with:
smp_store_release(per_cpu(cpufreq_update_util_data, cpu), data))
?

It would be nice IMO to be explicit about the intention of release/publish
semantics by using smp_store_release().

thanks,

 - Joel



Re: [PATCH RFC 4/5] sched/topology: Annonate RCU pointers properly

2019-02-21 Thread Joel Fernandes
Hi Peter,

Thanks for taking a look.

On Thu, Feb 21, 2019 at 10:19:44AM +0100, Peter Zijlstra wrote:
> On Thu, Feb 21, 2019 at 12:49:41AM -0500, Joel Fernandes (Google) wrote:
> 
> > Also replace rcu_assign_pointer call on rq->sd with WRITE_ONCE. This
> > should be sufficient for the rq->sd initialization.
> 
> > @@ -668,7 +668,7 @@ cpu_attach_domain(struct sched_domain *sd, struct 
> > root_domain *rd, int cpu)
> >  
> > rq_attach_root(rq, rd);
> > tmp = rq->sd;
> > -   rcu_assign_pointer(rq->sd, sd);
> > +   WRITE_ONCE(rq->sd, sd);
> > dirty_sched_domain_sysctl(cpu);
> > destroy_sched_domains(tmp);
> 
> Where did the RELEASE barrier go?
> 
> That was a publish operation, now it is not.

Funny thing is, initially I had written this patch with smp_store_release()
instead of WRITE_ONCE, but checkpatch complaints with that since it needs a
comment on top of it, and I wasn't sure if RELEASE barrier was the intent of
using rcu_assign_pointer (all the more reason to replace it with something
more explicit).

I will replace it with the following and resubmit it then:

/* Release barrier */
smp_store_release(&rq->sd, sd);

Or do we want to just drop the "Release barrier" comment and live with the
checkpatch warning?

(my same response applies to patch 5/5).

thanks,

 - Joel



[PATCH RFC 2/5] ixgbe: Fix incorrect RCU API usage

2019-02-20 Thread Joel Fernandes (Google)
Recently, I added an RCU annotation check in rcu_assign_pointer. This
caused a sparse error to be reported by the ixgbe driver.

Further looking, it seems the adapter->xdp_prog pointer is not annotated
with __rcu. Annonating it fixed the error, but caused a bunch of other
warnings.

This patch tries to fix all warnings by using RCU API properly. This
makes sense to do because not using RCU properly can result in various
hard to find bugs. This is a best effort fix and is only build tested.
The sparse errors and warnings go away with the change. I request
maintainers / developers in this area to test it properly.

Signed-off-by: Joel Fernandes (Google) 
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h  |  4 ++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 17 -
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h 
b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 08d85e336bd4..3b14daf27516 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -311,7 +311,7 @@ struct ixgbe_ring {
struct ixgbe_ring *next;/* pointer to next ring in q_vector */
struct ixgbe_q_vector *q_vector; /* backpointer to host q_vector */
struct net_device *netdev;  /* netdev ring belongs to */
-   struct bpf_prog *xdp_prog;
+   struct bpf_prog __rcu *xdp_prog;
struct device *dev; /* device for DMA mapping */
void *desc; /* descriptor ring memory */
union {
@@ -560,7 +560,7 @@ struct ixgbe_adapter {
unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
/* OS defined structs */
struct net_device *netdev;
-   struct bpf_prog *xdp_prog;
+   struct bpf_prog __rcu *xdp_prog;
struct pci_dev *pdev;
struct mii_bus *mii_bus;
 
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index daff8183534b..6aa59bb13a14 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -2199,7 +2199,7 @@ static struct sk_buff *ixgbe_run_xdp(struct ixgbe_adapter 
*adapter,
u32 act;
 
rcu_read_lock();
-   xdp_prog = READ_ONCE(rx_ring->xdp_prog);
+   xdp_prog = rcu_dereference(rx_ring->xdp_prog);
 
if (!xdp_prog)
goto xdp_out;
@@ -6547,7 +6547,7 @@ int ixgbe_setup_rx_resources(struct ixgbe_adapter 
*adapter,
 rx_ring->queue_index) < 0)
goto err;
 
-   rx_ring->xdp_prog = adapter->xdp_prog;
+   rcu_assign_pointer(rx_ring->xdp_prog, adapter->xdp_prog);
 
return 0;
 err:
@@ -10246,7 +10246,10 @@ static int ixgbe_xdp_setup(struct net_device *dev, 
struct bpf_prog *prog)
if (nr_cpu_ids > MAX_XDP_QUEUES)
return -ENOMEM;
 
-   old_prog = xchg(&adapter->xdp_prog, prog);
+   rcu_read_lock();
+   old_prog = rcu_dereference(adapter->xdp_prog);
+   rcu_assign_pointer(adapter->xdp_prog, prog);
+   rcu_read_unlock();
 
/* If transitioning XDP modes reconfigure rings */
if (!!prog != !!old_prog) {
@@ -10271,13 +10274,17 @@ static int ixgbe_xdp_setup(struct net_device *dev, 
struct bpf_prog *prog)
 static int ixgbe_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 {
struct ixgbe_adapter *adapter = netdev_priv(dev);
+   struct bpf_prog *prog;
 
switch (xdp->command) {
case XDP_SETUP_PROG:
return ixgbe_xdp_setup(dev, xdp->prog);
case XDP_QUERY_PROG:
-   xdp->prog_id = adapter->xdp_prog ?
-   adapter->xdp_prog->aux->id : 0;
+   rcu_read_lock();
+   prog = rcu_dereference(adapter->xdp_prog);
+   xdp->prog_id = prog ? prog->aux->id : 0;
+   rcu_read_unlock();
+
return 0;
case XDP_QUERY_XSK_UMEM:
return ixgbe_xsk_umem_query(adapter, &xdp->xsk.umem,
-- 
2.21.0.rc0.258.g878e2cd30e-goog



[PATCH RFC 4/5] sched/topology: Annonate RCU pointers properly

2019-02-20 Thread Joel Fernandes (Google)
The scheduler's topology code uses rcu_assign_pointer() to initialize
various pointers.

Let us annotate the pointers correctly which also help avoid future
bugs. This suppresses the new sparse errors caused by an annotation
check I added to rcu_assign_pointer().

Also replace rcu_assign_pointer call on rq->sd with WRITE_ONCE. This
should be sufficient for the rq->sd initialization.

This fixes sparse errors:
kernel//sched/topology.c:378:9: sparse: error: incompatible types in
comparison expression (different address spaces)
kernel//sched/topology.c:387:9: sparse: error: incompatible types in
comparison expression (different address spaces)
kernel//sched/topology.c:612:9: sparse: error: incompatible types in
comparison expression (different address spaces)
kernel//sched/topology.c:615:9: sparse: error: incompatible types in
comparison expression (different address spaces)
kernel//sched/topology.c:618:9: sparse: error: incompatible types in
comparison expression (different address spaces)
kernel//sched/topology.c:621:9: sparse: error: incompatible types in
comparison expression (different address spaces)

Signed-off-by: Joel Fernandes (Google) 
---
 kernel/sched/sched.h| 12 ++--
 kernel/sched/topology.c | 12 ++--
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 2ab545d40381..806703afd4b0 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -780,7 +780,7 @@ struct root_domain {
 * NULL-terminated list of performance domains intersecting with the
 * CPUs of the rd. Protected by RCU.
 */
-   struct perf_domain  *pd;
+   struct perf_domain __rcu *pd;
 };
 
 extern struct root_domain def_root_domain;
@@ -1305,13 +1305,13 @@ static inline struct sched_domain 
*lowest_flag_domain(int cpu, int flag)
return sd;
 }
 
-DECLARE_PER_CPU(struct sched_domain *, sd_llc);
+DECLARE_PER_CPU(struct sched_domain __rcu *, sd_llc);
 DECLARE_PER_CPU(int, sd_llc_size);
 DECLARE_PER_CPU(int, sd_llc_id);
-DECLARE_PER_CPU(struct sched_domain_shared *, sd_llc_shared);
-DECLARE_PER_CPU(struct sched_domain *, sd_numa);
-DECLARE_PER_CPU(struct sched_domain *, sd_asym_packing);
-DECLARE_PER_CPU(struct sched_domain *, sd_asym_cpucapacity);
+DECLARE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
+DECLARE_PER_CPU(struct sched_domain __rcu *, sd_numa);
+DECLARE_PER_CPU(struct sched_domain __rcu *, sd_asym_packing);
+DECLARE_PER_CPU(struct sched_domain __rcu *, sd_asym_cpucapacity);
 extern struct static_key_false sched_asym_cpucapacity;
 
 struct sched_group_capacity {
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 3f35ba1d8fde..2eab2e16ded5 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -586,13 +586,13 @@ static void destroy_sched_domains(struct sched_domain *sd)
  * the cpumask of the domain), this allows us to quickly tell if
  * two CPUs are in the same cache domain, see cpus_share_cache().
  */
-DEFINE_PER_CPU(struct sched_domain *, sd_llc);
+DEFINE_PER_CPU(struct sched_domain __rcu *, sd_llc);
 DEFINE_PER_CPU(int, sd_llc_size);
 DEFINE_PER_CPU(int, sd_llc_id);
-DEFINE_PER_CPU(struct sched_domain_shared *, sd_llc_shared);
-DEFINE_PER_CPU(struct sched_domain *, sd_numa);
-DEFINE_PER_CPU(struct sched_domain *, sd_asym_packing);
-DEFINE_PER_CPU(struct sched_domain *, sd_asym_cpucapacity);
+DEFINE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
+DEFINE_PER_CPU(struct sched_domain __rcu *, sd_numa);
+DEFINE_PER_CPU(struct sched_domain __rcu *, sd_asym_packing);
+DEFINE_PER_CPU(struct sched_domain __rcu *, sd_asym_cpucapacity);
 DEFINE_STATIC_KEY_FALSE(sched_asym_cpucapacity);
 
 static void update_top_cache_domain(int cpu)
@@ -668,7 +668,7 @@ cpu_attach_domain(struct sched_domain *sd, struct 
root_domain *rd, int cpu)
 
rq_attach_root(rq, rd);
tmp = rq->sd;
-   rcu_assign_pointer(rq->sd, sd);
+   WRITE_ONCE(rq->sd, sd);
dirty_sched_domain_sysctl(cpu);
destroy_sched_domains(tmp);
 
-- 
2.21.0.rc0.258.g878e2cd30e-goog



[PATCH RFC 0/5] RCU fixes for rcu_assign_pointer() usage

2019-02-20 Thread Joel Fernandes (Google)
These patches fix various RCU API usage issues found due to sparse errors as a
result of the recent check to add rcu_check_sparse() to rcu_assign_pointer().
The errors in many cases seem to indicate either an incorrect API usage, or
missing annotations. The annotations added can also help avoid future incorrect
usages and bugs so it is a good idea to do in any case.

These are only build/boot tested and I request for feedback from maintainers
and developers in the various areas the patches touch. Thanks for any feedback!

(There are still errors in rbtree.h but I have kept those for a later time
since fixing them is a bit more involved).

Joel Fernandes (Google) (5):
net: rtnetlink: Fix incorrect RCU API usage
ixgbe: Fix incorrect RCU API usage
sched/cpufreq: Fix incorrect RCU API usage
sched/topology: Annonate RCU pointers properly
rcuwait: Replace rcu_assign_pointer() with WRITE_ONCE

drivers/net/ethernet/intel/ixgbe/ixgbe.h  |  4 ++--
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 17 -
include/linux/rcuwait.h   |  2 +-
kernel/sched/cpufreq.c|  8 ++--
kernel/sched/sched.h  | 14 +++---
kernel/sched/topology.c   | 12 ++--
net/core/rtnetlink.c  |  4 ++--
7 files changed, 36 insertions(+), 25 deletions(-)

--
2.21.0.rc0.258.g878e2cd30e-goog



[PATCH RFC 3/5] sched/cpufreq: Fix incorrect RCU API usage

2019-02-20 Thread Joel Fernandes (Google)
Recently I added an RCU annotation check to rcu_assign_pointer(). All
pointers assigned to RCU protected data are to be annotated with __rcu
inorder to be able to use rcu_assign_pointer() similar to checks in
other RCU APIs.

This resulted in a sparse error: kernel//sched/cpufreq.c:41:9: sparse:
error: incompatible types in comparison expression (different address
spaces)

Fix this by using the correct APIs for RCU accesses. This will
potentially avoid any future bugs in the code. If it is felt that RCU
protection is not needed here, then the rcu_assign_pointer call can be
dropped and replaced with, say, WRITE_ONCE or smp_store_release. Or, may
be we add a new API to do it. But calls rcu_assign_pointer seems an
abuse of the RCU API unless RCU is being used.

Signed-off-by: Joel Fernandes (Google) 
---
 kernel/sched/cpufreq.c | 8 ++--
 kernel/sched/sched.h   | 2 +-
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/cpufreq.c b/kernel/sched/cpufreq.c
index 22bd8980f32f..c9aeb3bf5dc2 100644
--- a/kernel/sched/cpufreq.c
+++ b/kernel/sched/cpufreq.c
@@ -7,7 +7,7 @@
  */
 #include "sched.h"
 
-DEFINE_PER_CPU(struct update_util_data *, cpufreq_update_util_data);
+DEFINE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
 
 /**
  * cpufreq_add_update_util_hook - Populate the CPU's update_util_data pointer.
@@ -34,8 +34,12 @@ void cpufreq_add_update_util_hook(int cpu, struct 
update_util_data *data,
if (WARN_ON(!data || !func))
return;
 
-   if (WARN_ON(per_cpu(cpufreq_update_util_data, cpu)))
+   rcu_read_lock();
+   if (WARN_ON(rcu_dereference(per_cpu(cpufreq_update_util_data, cpu {
+   rcu_read_unlock();
return;
+   }
+   rcu_read_unlock();
 
data->func = func;
rcu_assign_pointer(per_cpu(cpufreq_update_util_data, cpu), data);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index d04530bf251f..2ab545d40381 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2166,7 +2166,7 @@ static inline u64 irq_time_read(int cpu)
 #endif /* CONFIG_IRQ_TIME_ACCOUNTING */
 
 #ifdef CONFIG_CPU_FREQ
-DECLARE_PER_CPU(struct update_util_data *, cpufreq_update_util_data);
+DECLARE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
 
 /**
  * cpufreq_update_util - Take a note about CPU utilization changes.
-- 
2.21.0.rc0.258.g878e2cd30e-goog



[PATCH RFC 5/5] rcuwait: Replace rcu_assign_pointer() with WRITE_ONCE

2019-02-20 Thread Joel Fernandes (Google)
This suppresses a sparse error generated due to the recently added
rcu_assign_pointer sparse check below. It seems WRITE_ONCE should be
sufficient here.

>> kernel//locking/percpu-rwsem.c:162:9: sparse: error: incompatible
types in comparison expression (different address spaces)

Signed-off-by: Joel Fernandes (Google) 
---
 include/linux/rcuwait.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/rcuwait.h b/include/linux/rcuwait.h
index 90bfa3279a01..9e5b4760e6c2 100644
--- a/include/linux/rcuwait.h
+++ b/include/linux/rcuwait.h
@@ -44,7 +44,7 @@ extern void rcuwait_wake_up(struct rcuwait *w);
 */ \
WARN_ON(current->exit_state);   \
\
-   rcu_assign_pointer((w)->task, current); \
+   WRITE_ONCE((w)->task, current); \
for (;;) {  \
/*  \
 * Implicit barrier (A) pairs with (B) in   \
-- 
2.21.0.rc0.258.g878e2cd30e-goog



[PATCH RFC 1/5] net: rtnetlink: Fix incorrect RCU API usage

2019-02-20 Thread Joel Fernandes (Google)
rtnl_register_internal() and rtnl_unregister_all tries to directly
dereference an RCU protected pointed outside RCU read side section.
While this is Ok to do since a lock is held, let us use the correct
API to avoid programmer bugs in the future.

This also fixes sparse warnings arising from not using RCU API.

net/core/rtnetlink.c:332:13: warning: incorrect type in assignment
(different address spaces) net/core/rtnetlink.c:332:13:expected
struct rtnl_link **tab net/core/rtnetlink.c:332:13:got struct
rtnl_link *[noderef] *

Signed-off-by: Joel Fernandes (Google) 
---
 net/core/rtnetlink.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 5ea1bed08ede..98be4b4818a9 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -188,7 +188,7 @@ static int rtnl_register_internal(struct module *owner,
msgindex = rtm_msgindex(msgtype);
 
rtnl_lock();
-   tab = rtnl_msg_handlers[protocol];
+   tab = rtnl_dereference(rtnl_msg_handlers[protocol]);
if (tab == NULL) {
tab = kcalloc(RTM_NR_MSGTYPES, sizeof(void *), GFP_KERNEL);
if (!tab)
@@ -329,7 +329,7 @@ void rtnl_unregister_all(int protocol)
BUG_ON(protocol < 0 || protocol > RTNL_FAMILY_MAX);
 
rtnl_lock();
-   tab = rtnl_msg_handlers[protocol];
+   tab = rtnl_dereference(rtnl_msg_handlers[protocol]);
if (!tab) {
rtnl_unlock();
return;
-- 
2.21.0.rc0.258.g878e2cd30e-goog



Re: [PATCH v2 1/2] Provide in-kernel headers for making it easy to extend the kernel

2019-02-19 Thread Joel Fernandes
On Tue, Feb 19, 2019 at 01:42:13PM +0900, Masahiro Yamada wrote:
> On Tue, Feb 19, 2019 at 1:14 PM Masahiro Yamada
>  wrote:
> >
> > On Fri, Feb 15, 2019 at 11:48 PM Alexei Starovoitov
> >  wrote:
> > >
> > > On Mon, Feb 11, 2019 at 09:35:59AM -0500, Joel Fernandes (Google) wrote:
> > > > Introduce in-kernel headers and other artifacts which are made available
> > > > as an archive through proc (/proc/kheaders.txz file).
> >
> >
> > The extension '.txz' is not used in kernel code.
> >
> >
> > '.tar.xz' is used for 'tarxz-pkg', 'perf-tarxz-src-pkg' etc.
> >
> >
> > $ git grep  '\.txz'
> > $ git grep  '\.tar\.xz'
> > Documentation/admin-guide/README.rst: xz -cd linux-4.X.tar.xz | tar xvf 
> > -
> > arch/x86/crypto/camellia-aesni-avx-asm_64.S: *
> > http://koti.mbnet.fi/axh/crypto/camellia-BSD-1.2.0-aesni1.tar.xz
> > crypto/testmgr.h: *   
> > https://bench.cr.yp.to/supercop/supercop-20170228.tar.xz
> > crypto/testmgr.h: *   
> > https://bench.cr.yp.to/supercop/supercop-20170228.tar.xz
> > crypto/testmgr.h: *   
> > https://bench.cr.yp.to/supercop/supercop-20170228.tar.xz
> > crypto/testmgr.h: *   
> > https://bench.cr.yp.to/supercop/supercop-20170228.tar.xz
> > crypto/testmgr.h: *   
> > https://bench.cr.yp.to/supercop/supercop-20170228.tar.xz
> > scripts/package/Makefile:   @echo '  perf-tarxz-src-pkg  - Build
> > $(perf-tar).tar.xz source tarball'
> > tools/testing/selftests/gen_kselftest_tar.sh:
> >  ext=".tar.xz"
> >
> >
> >
> > I prefer '.tar.xz' for consistency.
> >
> >
> >
> >
> >
> >
> > BTW, have you ever looked at scripts/extract-ikconfig?
> >
> > You added IKHD_ST and IKHD_ED
> > just to mimic kernel/configs.c
> >
> > It is currently pointless without the extracting tool,
> > but you might think it is useful to extract headers
> > from vmlinux or the module without mounting procfs.
> >
> >
> >
> >
> > > > This archive makes
> > > > it possible to build kernel modules, run eBPF programs, and other
> > > > tracing programs that need to extend the kernel for tracing purposes
> > > > without any dependency on the file system having headers and build
> > > > artifacts.
> > > >
> > > > On Android and embedded systems, it is common to switch kernels but not
> > > > have kernel headers available on the file system. Raw kernel headers
> > > > also cannot be copied into the filesystem like they can be on other
> > > > distros, due to licensing and other issues. There's no linux-headers
> > > > package on Android. Further once a different kernel is booted, any
> > > > headers stored on the file system will no longer be useful. By storing
> > > > the headers as a compressed archive within the kernel, we can avoid 
> > > > these
> > > > issues that have been a hindrance for a long time.
> > >
> > > The set looks good to me and since the main use case is building bpf progs
> > > I can route it via bpf-next tree if there are no objections.
> > > Masahiro, could you please ack it?
> >
> >
> > Honestly, I was not tracking this thread
> > since I did not know I was responsible for this.
> >
> >
> > I just started to take a closer look, then immediately got scared.
> >
> > This version is not mature enough for the merge.
> >
> >
> >
> > First of all, this patch cannot be compiled out-of-tree (O= option).
> >
> >
> > I do not know why 0-day bot did not catch this apparent breakage.
> >
> >
> > $ make -j8 O=hoge
> > make[1]: Entering directory '/home/masahiro/workspace/bsp/linux/hoge'
> >   GEN Makefile
> >   Using .. as source for kernel
> >   DESCEND  objtool
> >   CALL../scripts/checksyscalls.sh
> >   CHK include/generated/compile.h
> > make[2]: *** No rule to make target 'Module.symvers', needed by
> > 'kernel/kheaders_data.txz'.  Stop.
> > make[2]: *** Waiting for unfinished jobs
> > /home/masahiro/workspace/bsp/linux/Makefile:1043: recipe for target
> > 'kernel' failed
> > make[1]: *** [kernel] Error 2
> > make[1]: *** Waiting for unfinished jobs
> > make[1]: Leaving directory '/home/masahiro/workspace/bsp/linux/hoge'
> > Makefile:152: recipe for target 'sub-make&

Re: [PATCH v2 1/2] Provide in-kernel headers for making it easy to extend the kernel

2019-02-18 Thread Joel Fernandes
On Tue, Feb 19, 2019 at 01:42:13PM +0900, Masahiro Yamada wrote:
[..]
> > > > This archive makes
> > > > it possible to build kernel modules, run eBPF programs, and other
> > > > tracing programs that need to extend the kernel for tracing purposes
> > > > without any dependency on the file system having headers and build
> > > > artifacts.
> > > >
> > > > On Android and embedded systems, it is common to switch kernels but not
> > > > have kernel headers available on the file system. Raw kernel headers
> > > > also cannot be copied into the filesystem like they can be on other
> > > > distros, due to licensing and other issues. There's no linux-headers
> > > > package on Android. Further once a different kernel is booted, any
> > > > headers stored on the file system will no longer be useful. By storing
> > > > the headers as a compressed archive within the kernel, we can avoid 
> > > > these
> > > > issues that have been a hindrance for a long time.
> > >
> > > The set looks good to me and since the main use case is building bpf progs
> > > I can route it via bpf-next tree if there are no objections.
> > > Masahiro, could you please ack it?
> >
> >
> > Honestly, I was not tracking this thread
> > since I did not know I was responsible for this.
> >
> >
> > I just started to take a closer look, then immediately got scared.
> >
> > This version is not mature enough for the merge.
> >
> >
> >
> > First of all, this patch cannot be compiled out-of-tree (O= option).
> >
> >
> > I do not know why 0-day bot did not catch this apparent breakage.
> >
> >
> > $ make -j8 O=hoge
> > make[1]: Entering directory '/home/masahiro/workspace/bsp/linux/hoge'
> >   GEN Makefile
> >   Using .. as source for kernel
> >   DESCEND  objtool
> >   CALL../scripts/checksyscalls.sh
> >   CHK include/generated/compile.h
> > make[2]: *** No rule to make target 'Module.symvers', needed by
> > 'kernel/kheaders_data.txz'.  Stop.
> > make[2]: *** Waiting for unfinished jobs
> > /home/masahiro/workspace/bsp/linux/Makefile:1043: recipe for target
> > 'kernel' failed
> > make[1]: *** [kernel] Error 2
> > make[1]: *** Waiting for unfinished jobs
> > make[1]: Leaving directory '/home/masahiro/workspace/bsp/linux/hoge'
> > Makefile:152: recipe for target 'sub-make' failed
> > make: *** [sub-make] Error 2
> 
> 
> 
> 
> I saw this build error for in-tree building as well.
> 
> We cannot build this from a pristine source tree.
> 
> For example, I observed the build error in the following procedure.
> 
> $ make mrproper
> $ make defconfig
>  Set CONFIG_IKHEADERS_PROC=y
> $ make
> 
> 
> 
> 
> Module.symvers is generated in the modpost stage
> (the very last stage of build).
> 
> But, vmlinux depends on kernel/kheaders_data.txz,
> which includes Module.symvers.
> 
> So, this is not so simple since it is a circular dependency...

I guess I was not building a pristine tree either and missed this circular
dependency :-/ . Any ideas on how we can fix the Module.symvers issue? One
idea is to reserve the space in the binaries, but only populate the space
reserved in the binary *after* the modpost stage, once the archive is ready..

thanks,

 - Joel



Re: [PATCH v2 1/2] Provide in-kernel headers for making it easy to extend the kernel

2019-02-18 Thread Joel Fernandes
On Tue, Feb 19, 2019 at 01:14:11PM +0900, Masahiro Yamada wrote:
> On Fri, Feb 15, 2019 at 11:48 PM Alexei Starovoitov
>  wrote:
> >
> > On Mon, Feb 11, 2019 at 09:35:59AM -0500, Joel Fernandes (Google) wrote:
> > > Introduce in-kernel headers and other artifacts which are made available
> > > as an archive through proc (/proc/kheaders.txz file).
> 
> 
> The extension '.txz' is not used in kernel code.
> 
> 
> '.tar.xz' is used for 'tarxz-pkg', 'perf-tarxz-src-pkg' etc.
> 
> 
> $ git grep  '\.txz'
> $ git grep  '\.tar\.xz'
> Documentation/admin-guide/README.rst: xz -cd linux-4.X.tar.xz | tar xvf -
> arch/x86/crypto/camellia-aesni-avx-asm_64.S: *
> http://koti.mbnet.fi/axh/crypto/camellia-BSD-1.2.0-aesni1.tar.xz
> crypto/testmgr.h: *   https://bench.cr.yp.to/supercop/supercop-20170228.tar.xz
> crypto/testmgr.h: *   https://bench.cr.yp.to/supercop/supercop-20170228.tar.xz
> crypto/testmgr.h: *   https://bench.cr.yp.to/supercop/supercop-20170228.tar.xz
> crypto/testmgr.h: *   https://bench.cr.yp.to/supercop/supercop-20170228.tar.xz
> crypto/testmgr.h: *   https://bench.cr.yp.to/supercop/supercop-20170228.tar.xz
> scripts/package/Makefile:   @echo '  perf-tarxz-src-pkg  - Build
> $(perf-tar).tar.xz source tarball'
> tools/testing/selftests/gen_kselftest_tar.sh:
>  ext=".tar.xz"
> 
> 
> 
> I prefer '.tar.xz' for consistency.

Ok, I will change it to that.

> BTW, have you ever looked at scripts/extract-ikconfig?
> 
> You added IKHD_ST and IKHD_ED
> just to mimic kernel/configs.c
> 
> It is currently pointless without the extracting tool,
> but you might think it is useful to extract headers
> from vmlinux or the module without mounting procfs.

I am planing add to extraction support in the future, so I prefer to leave the
markers as it is for now. Hope that's Ok with you.


> > > This archive makes
> > > it possible to build kernel modules, run eBPF programs, and other
> > > tracing programs that need to extend the kernel for tracing purposes
> > > without any dependency on the file system having headers and build
> > > artifacts.
> > >
> > > On Android and embedded systems, it is common to switch kernels but not
> > > have kernel headers available on the file system. Raw kernel headers
> > > also cannot be copied into the filesystem like they can be on other
> > > distros, due to licensing and other issues. There's no linux-headers
> > > package on Android. Further once a different kernel is booted, any
> > > headers stored on the file system will no longer be useful. By storing
> > > the headers as a compressed archive within the kernel, we can avoid these
> > > issues that have been a hindrance for a long time.
> >
> > The set looks good to me and since the main use case is building bpf progs
> > I can route it via bpf-next tree if there are no objections.
> > Masahiro, could you please ack it?
> 
> 
> Honestly, I was not tracking this thread
> since I did not know I was responsible for this.
> 
> 
> I just started to take a closer look, then immediately got scared.
> 
> This version is not mature enough for the merge.
> First of all, this patch cannot be compiled out-of-tree (O= option).

Oh, ok. This is not how I build the kernel. So I am sorry for missing that. I
will try this out-of-tree build option and fix it.


> I do not know why 0-day bot did not catch this apparent breakage.
> 
> 
> $ make -j8 O=hoge
> make[1]: Entering directory '/home/masahiro/workspace/bsp/linux/hoge'
>   GEN Makefile
>   Using .. as source for kernel
>   DESCEND  objtool
>   CALL../scripts/checksyscalls.sh
>   CHK include/generated/compile.h
> make[2]: *** No rule to make target 'Module.symvers', needed by
> 'kernel/kheaders_data.txz'.  Stop.
> make[2]: *** Waiting for unfinished jobs
> /home/masahiro/workspace/bsp/linux/Makefile:1043: recipe for target
> 'kernel' failed
> make[1]: *** [kernel] Error 2
> make[1]: *** Waiting for unfinished jobs
> make[1]: Leaving directory '/home/masahiro/workspace/bsp/linux/hoge'
> Makefile:152: recipe for target 'sub-make' failed
> make: *** [sub-make] Error 2
> 
> 
> 
> 
> 
> I was able to compile it in-tree
> but it makes the incremental build extremely slow.
> 
> (Here, the incremental build means
> "make" without changing any code after the full build.)
> 
> 
> 
> 
> Before this patch, "make -j8" took 11 sec on my machine.
> 
> real 0m11.777s
> user 0m16.608s

Re: [PATCH v2 1/2] Provide in-kernel headers for making it easy to extend the kernel

2019-02-14 Thread Joel Fernandes
On Thu, Feb 14, 2019 at 07:19:29PM -0800, Alexei Starovoitov wrote:
> On Mon, Feb 11, 2019 at 09:35:59AM -0500, Joel Fernandes (Google) wrote:
> > Introduce in-kernel headers and other artifacts which are made available
> > as an archive through proc (/proc/kheaders.txz file). This archive makes
> > it possible to build kernel modules, run eBPF programs, and other
> > tracing programs that need to extend the kernel for tracing purposes
> > without any dependency on the file system having headers and build
> > artifacts.
> > 
> > On Android and embedded systems, it is common to switch kernels but not
> > have kernel headers available on the file system. Raw kernel headers
> > also cannot be copied into the filesystem like they can be on other
> > distros, due to licensing and other issues. There's no linux-headers
> > package on Android. Further once a different kernel is booted, any
> > headers stored on the file system will no longer be useful. By storing
> > the headers as a compressed archive within the kernel, we can avoid these
> > issues that have been a hindrance for a long time.
> 
> The set looks good to me and since the main use case is building bpf progs
> I can route it via bpf-next tree if there are no objections.
> Masahiro, could you please ack it?
> 

Yes, eBPF is one of the usecases. After this, I am also planning to send
patches to BCC so that it can use this feature when compiling C to eBPF.

Thanks!

 - Joel



Re: syzbot rcu/debugobjects warning

2018-03-24 Thread Joel Fernandes
On Fri, Mar 23, 2018 at 1:41 PM, Thomas Gleixner  wrote:
> On Fri, 23 Mar 2018, Joel Fernandes wrote:
>> On Fri, Mar 23, 2018 at 2:11 AM, Thomas Gleixner  wrote:
>> > On Thu, 22 Mar 2018, Joel Fernandes wrote:
>> Sorry. Here is the raw crash log: https://pastebin.com/raw/puvh0cXE
>> (The kernel logs are toward the end with the above).
>
> And that is interesting:
>
> [  150.629667]   [  150.631700]  [] 
> dump_stack+0xc1/0x128
> [  150.637051]  [] ? __debug_object_init+0x526/0xc40
> [  150.643431]  [] panic+0x1bc/0x3a8
> [  150.648416]  [] ? 
> percpu_up_read_preempt_enable.constprop.53+0xd7/0xd7
> [  150.656611]  [] ? load_image_and_restore+0xf9/0xf9
> [  150.663070]  [] ? vprintk_default+0x1d/0x30
> [  150.668925]  [] ? __warn+0x1a9/0x1e0
> [  150.674170]  [] ? __debug_object_init+0x526/0xc40
> [  150.680543]  [] __warn+0x1c4/0x1e0
> [  150.685614]  [] warn_slowpath_null+0x2c/0x40
> [  150.691972]  [] __debug_object_init+0x526/0xc40
> [  150.698174]  [] ? debug_object_fixup+0x30/0x30
> [  150.704283]  [] debug_object_init_on_stack+0x19/0x20
> [  150.710917]  [] __wait_rcu_gp+0x93/0x1b0
> [  150.716508]  [] synchronize_rcu.part.65+0x101/0x110
> [  150.723054]  [] ? rcu_pm_notify+0xc0/0xc0
> [  150.728735]  [] ? __call_rcu.constprop.72+0x910/0x910
> [  150.735459]  [] ? __lock_is_held+0xa1/0xf0
> [  150.741223]  [] synchronize_rcu+0x27/0x90
>
> So this calls synchronize_rcu from a rcu callback. That's a nono. This is
> on the back of an interrupt in softirq context and __wait_rcu_gp() can
> sleep, which is obviously a bad idea in softirq context
>
> Cc'ed netdev 
>
> And that also explains the debug object splat because this is not running
> on the task stack. It's running on the softirq stack 
>
> [  150.746908]  [] __l2tp_session_unhash+0x3d5/0x550
> [  150.753281]  [] ? __l2tp_session_unhash+0x1bf/0x550
> [  150.759828]  [] ? __local_bh_enable_ip+0x6a/0xd0
> [  150.766123]  [] ? l2tp_udp_encap_recv+0xd90/0xd90
> [  150.772497]  [] l2tp_tunnel_closeall+0x1e7/0x3a0
> [  150.778782]  [] l2tp_tunnel_destruct+0x30e/0x5a0
> [  150.785067]  [] ? l2tp_tunnel_destruct+0x1aa/0x5a0
> [  150.791537]  [] ? l2tp_tunnel_del_work+0x460/0x460
> [  150.797997]  [] __sk_destruct+0x53/0x570
> [  150.803588]  [] rcu_process_callbacks+0x898/0x1300
> [  150.810048]  [] ? rcu_process_callbacks+0x977/0x1300
> [  150.816684]  [] ? __sk_dst_check+0x240/0x240
> [  150.822625]  [] __do_softirq+0x206/0x951
> [  150.828223]  [] irq_exit+0x165/0x190
> [  150.833557]  [] smp_apic_timer_interrupt+0x7b/0xa0
> [  150.840018]  [] apic_timer_interrupt+0xa0/0xb0
> [  150.846132]   [  150.848166]  [] ? 
> native_safe_halt+0x6/0x10
> [  150.854036]  [] ? trace_hardirqs_on+0xd/0x10
> [  150.859973]  [] default_idle+0x55/0x360
> [  150.865478]  [] arch_cpu_idle+0xa/0x10
> [  150.870896]  [] default_idle_call+0x36/0x60
> [  150.876751]  [] cpu_startup_entry+0x2b0/0x380
> [  150.882787]  [] ? cpu_in_idle+0x20/0x20
> [  150.888291]  [] ? clockevents_register_device+0x123/0x200
> [  150.895358]  [] start_secondary+0x303/0x3e0
> [  150.901209]  [] ? set_cpu_sibling_map+0x11f0/0x11f0

Thomas, thanks a lot. It appears this issue will not happen on
mainline since from commit  765924e362d1  (subject "l2tp: don't close
sessions in l2tp_tunnel_destruct()"), l2tp_tunnel_closeall is no
longer called from l2tp_tunnel_destruct. From that commit message it
seems one of the motivations is to solve scheduling from atomic issue.

However for this change to be applied to android-4.9 and/or 4.9
stable, it depends on several other l2p patches and they aren't
straight forward cherry-picks from mainline (and I don't have much
background with this driver).

v3.16.56 stable seems to be further along with l2tp than v4.9.89, in
that it atleast has more of the upstream patches adapted for it, that
the above patch depends on. Since this also related to stable, I am
CC'ing Greg kh and Ben.

Here are some of the commits in 3.16 stable that I couldn't find
applied to v4.9 stable. The above fix quotes the below patches as
dependencies so they would need to be stable backported. Also CC'ing
Guillaume since he authored the above mentioned fix.

0c15ddabbcf l2tp: don't register sessions in l2tp_session_create()
a3c5d5b70f4e l2tp: fix race condition in l2tp_tunnel_delete
5b216e8dcda2 l2tp: prevent creation of sessions on terminated tunnels
76ff5e22f1e0 l2tp: hold tunnel while looking up sessions in l2tp_netlink
ceb8f6b23a38 l2tp: define parameters of l2tp_session_get*() as "const"
0295d020b63f l2tp: initialise session's refcount before making it reachable
29a77518927e l2tp: take reference on sessions being dumped
b301c9b7782f l2tp: take a reference on sessions used in genetlink ha

[PATCH v4 2/4] samples/bpf: Enable cross compiler support

2017-09-20 Thread Joel Fernandes
When cross compiling, bpf samples use HOSTCC for compiling the non-BPF part of
the sample, however what we really want is to use the cross compiler to build
for the cross target since that is what will load and run the BPF sample.
Detect this and compile samples correctly.

Acked-by: Alexei Starovoitov 
Signed-off-by: Joel Fernandes 
---
 samples/bpf/Makefile | 5 +
 1 file changed, 5 insertions(+)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index cf17c7932a6e..13f74b67ca44 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -177,6 +177,11 @@ HOSTLOADLIBES_syscall_tp += -lelf
 LLC ?= llc
 CLANG ?= clang
 
+# Detect that we're cross compiling and use the cross compiler
+ifdef CROSS_COMPILE
+HOSTCC = $(CROSS_COMPILE)gcc
+endif
+
 # Trick to allow make to be run from this directory
 all:
$(MAKE) -C ../../ $(CURDIR)/
-- 
2.14.1.821.g8fa685d3b7-goog



[PATCH v4 1/4] samples/bpf: Use getppid instead of getpgrp for array map stress

2017-09-20 Thread Joel Fernandes
When cross-compiling the bpf sample map_perf_test for aarch64, I find that
__NR_getpgrp is undefined. This causes build errors. This syscall is deprecated
and requires defining __ARCH_WANT_SYSCALL_DEPRECATED. To avoid having to define
that, just use a different syscall (getppid) for the array map stress test.

Acked-by: Alexei Starovoitov 
Signed-off-by: Joel Fernandes 
---
 samples/bpf/map_perf_test_kern.c | 2 +-
 samples/bpf/map_perf_test_user.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/samples/bpf/map_perf_test_kern.c b/samples/bpf/map_perf_test_kern.c
index 098c857f1eda..2b2ffb97018b 100644
--- a/samples/bpf/map_perf_test_kern.c
+++ b/samples/bpf/map_perf_test_kern.c
@@ -266,7 +266,7 @@ int stress_hash_map_lookup(struct pt_regs *ctx)
return 0;
 }
 
-SEC("kprobe/sys_getpgrp")
+SEC("kprobe/sys_getppid")
 int stress_array_map_lookup(struct pt_regs *ctx)
 {
u32 key = 1, i;
diff --git a/samples/bpf/map_perf_test_user.c b/samples/bpf/map_perf_test_user.c
index f388254896f6..a0310fc70057 100644
--- a/samples/bpf/map_perf_test_user.c
+++ b/samples/bpf/map_perf_test_user.c
@@ -282,7 +282,7 @@ static void test_array_lookup(int cpu)
 
start_time = time_get_ns();
for (i = 0; i < max_cnt; i++)
-   syscall(__NR_getpgrp, 0);
+   syscall(__NR_getppid, 0);
printf("%d:array_lookup %lld lookups per sec\n",
   cpu, max_cnt * 10ll * 64 / (time_get_ns() - start_time));
 }
-- 
2.14.1.821.g8fa685d3b7-goog



[PATCH v4 3/4] samples/bpf: Fix pt_regs issues when cross-compiling

2017-09-20 Thread Joel Fernandes
BPF samples fail to build when cross-compiling for ARM64 because of incorrect
pt_regs param selection. This is because clang defines __x86_64__ and
bpf_headers thinks we're building for x86. Since clang is building for the BPF
target, it shouldn't make assumptions about what target the BPF program is
going to run on. To fix this, lets pass ARCH so the header knows which target
the BPF program is being compiled for and can use the correct pt_regs code.

Acked-by: Alexei Starovoitov 
Signed-off-by: Joel Fernandes 
---
 samples/bpf/Makefile  |  2 +-
 tools/testing/selftests/bpf/bpf_helpers.h | 56 +++
 2 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 13f74b67ca44..ebc2ad69b62c 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -230,7 +230,7 @@ $(obj)/%.o: $(src)/%.c
$(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) -I$(obj) \
-I$(srctree)/tools/testing/selftests/bpf/ \
-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value 
-Wno-pointer-sign \
-   -Wno-compare-distinct-pointer-types \
+   -D__TARGET_ARCH_$(ARCH) -Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
-Wno-unknown-warning-option \
diff --git a/tools/testing/selftests/bpf/bpf_helpers.h 
b/tools/testing/selftests/bpf/bpf_helpers.h
index 36fb9161b34a..4875395b0b52 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -109,7 +109,47 @@ static int (*bpf_skb_under_cgroup)(void *ctx, void *map, 
int index) =
 static int (*bpf_skb_change_head)(void *, int len, int flags) =
(void *) BPF_FUNC_skb_change_head;
 
+/* Scan the ARCH passed in from ARCH env variable (see Makefile) */
+#if defined(__TARGET_ARCH_x86)
+   #define bpf_target_x86
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_s930x)
+   #define bpf_target_s930x
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_arm64)
+   #define bpf_target_arm64
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_mips)
+   #define bpf_target_mips
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_powerpc)
+   #define bpf_target_powerpc
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_sparc)
+   #define bpf_target_sparc
+   #define bpf_target_defined
+#else
+   #undef bpf_target_defined
+#endif
+
+/* Fall back to what the compiler says */
+#ifndef bpf_target_defined
 #if defined(__x86_64__)
+   #define bpf_target_x86
+#elif defined(__s390x__)
+   #define bpf_target_s930x
+#elif defined(__aarch64__)
+   #define bpf_target_arm64
+#elif defined(__mips__)
+   #define bpf_target_mips
+#elif defined(__powerpc__)
+   #define bpf_target_powerpc
+#elif defined(__sparc__)
+   #define bpf_target_sparc
+#endif
+#endif
+
+#if defined(bpf_target_x86)
 
 #define PT_REGS_PARM1(x) ((x)->di)
 #define PT_REGS_PARM2(x) ((x)->si)
@@ -122,7 +162,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->ip)
 
-#elif defined(__s390x__)
+#elif defined(bpf_target_s390x)
 
 #define PT_REGS_PARM1(x) ((x)->gprs[2])
 #define PT_REGS_PARM2(x) ((x)->gprs[3])
@@ -135,7 +175,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->gprs[15])
 #define PT_REGS_IP(x) ((x)->psw.addr)
 
-#elif defined(__aarch64__)
+#elif defined(bpf_target_arm64)
 
 #define PT_REGS_PARM1(x) ((x)->regs[0])
 #define PT_REGS_PARM2(x) ((x)->regs[1])
@@ -148,7 +188,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->pc)
 
-#elif defined(__mips__)
+#elif defined(bpf_target_mips)
 
 #define PT_REGS_PARM1(x) ((x)->regs[4])
 #define PT_REGS_PARM2(x) ((x)->regs[5])
@@ -161,7 +201,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->regs[29])
 #define PT_REGS_IP(x) ((x)->cp0_epc)
 
-#elif defined(__powerpc__)
+#elif defined(bpf_target_powerpc)
 
 #define PT_REGS_PARM1(x) ((x)->gpr[3])
 #define PT_REGS_PARM2(x) ((x)->gpr[4])
@@ -172,7 +212,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->nip)
 
-#elif defined(__sparc__)
+#elif defined(bpf_target_sparc)
 
 #define PT_REGS_PARM1(x) ((x)->u_regs[UREG_I0])
 #define PT_REGS_PARM2(x) ((x)->u_regs[UREG_I1])
@@ -182,6 +222,8 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_RET(x) ((x)->u_regs[UREG_I7])
 #define PT_REGS_RC(x) ((x)->u_regs[UREG_I0])
 #define PT_REGS_SP(x) ((x)->u_regs[UREG_FP])
+
+/* 

[PATCH v4 4/4] samples/bpf: Add documentation on cross compilation

2017-09-20 Thread Joel Fernandes
Acked-by: Alexei Starovoitov 
Signed-off-by: Joel Fernandes 
---
 samples/bpf/README.rst | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/samples/bpf/README.rst b/samples/bpf/README.rst
index 79f9a58f1872..2b906127ef54 100644
--- a/samples/bpf/README.rst
+++ b/samples/bpf/README.rst
@@ -64,3 +64,13 @@ It is also possible to point make to the newly compiled 
'llc' or
 'clang' command via redefining LLC or CLANG on the make command line::
 
  make samples/bpf/ LLC=~/git/llvm/build/bin/llc 
CLANG=~/git/llvm/build/bin/clang
+
+Cross compiling samples
+---
+Inorder to cross-compile, say for arm64 targets, export CROSS_COMPILE and ARCH
+environment variables before calling make. This will direct make to build
+samples for the cross target.
+
+export ARCH=arm64
+export CROSS_COMPILE="aarch64-linux-gnu-"
+make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang
-- 
2.14.1.821.g8fa685d3b7-goog



[PATCH v4 0/4] Add cross-compilation support to eBPF samples

2017-09-20 Thread Joel Fernandes
These patches fix issues seen when cross-compiling eBPF samples on arm64.
Compared to [1], I dropped the controversial inline-asm patch and exploring
other options to fix it. However these patches are a step in the right
direction and I look forward to getting them into -next and the merge window.

Changes since v3:
- just a repost with acks

[1] https://lkml.org/lkml/2017/8/7/417

Joel Fernandes (4):
  samples/bpf: Use getppid instead of getpgrp for array map stress
  samples/bpf: Enable cross compiler support
  samples/bpf: Fix pt_regs issues when cross-compiling
  samples/bpf: Add documentation on cross compilation

 samples/bpf/Makefile  |  7 +++-
 samples/bpf/README.rst| 10 ++
 samples/bpf/map_perf_test_kern.c  |  2 +-
 samples/bpf/map_perf_test_user.c  |  2 +-
 tools/testing/selftests/bpf/bpf_helpers.h | 56 +++
 5 files changed, 67 insertions(+), 10 deletions(-)

-- 
2.14.1.821.g8fa685d3b7-goog



Re: [PATCH RFC v3 0/4] Add cross-compilation support to eBPF samples

2017-09-05 Thread Joel Fernandes
Hi Alexei,

On Tue, Sep 5, 2017 at 8:24 PM, Alexei Starovoitov
 wrote:
> On Sun, Sep 03, 2017 at 11:23:21AM -0700, Joel Fernandes wrote:
>> These patches fix issues seen when cross-compiling eBPF samples on arm64.
>> Compared to [1], I dropped the controversial inline-asm patch pending further
>> discussion on the right way to do it. However these patches are still a step 
>> in
>> the right direction and I wanted them to get in before the more controversial
>> bit.
>
> All patches in this set look good to me.
> When you repost without changes after net-next reopens pls keep my:
> Acked-by: Alexei Starovoitov 

Thanks a lot! will do.

Regards,
Joel


[PATCH RFC v3 1/4] samples/bpf: Use getppid instead of getpgrp for array map stress

2017-09-03 Thread Joel Fernandes
When cross-compiling the bpf sample map_perf_test for aarch64, I find that
__NR_getpgrp is undefined. This causes build errors. This syscall is deprecated
and requires defining __ARCH_WANT_SYSCALL_DEPRECATED. To avoid having to define
that, just use a different syscall (getppid) for the array map stress test.

CC: Alexei Starovoitov 
CC: Daniel Borkmann 
Cc: David Miller 
Signed-off-by: Joel Fernandes 
---
 samples/bpf/map_perf_test_kern.c | 2 +-
 samples/bpf/map_perf_test_user.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/samples/bpf/map_perf_test_kern.c b/samples/bpf/map_perf_test_kern.c
index 098c857f1eda..2b2ffb97018b 100644
--- a/samples/bpf/map_perf_test_kern.c
+++ b/samples/bpf/map_perf_test_kern.c
@@ -266,7 +266,7 @@ int stress_hash_map_lookup(struct pt_regs *ctx)
return 0;
 }
 
-SEC("kprobe/sys_getpgrp")
+SEC("kprobe/sys_getppid")
 int stress_array_map_lookup(struct pt_regs *ctx)
 {
u32 key = 1, i;
diff --git a/samples/bpf/map_perf_test_user.c b/samples/bpf/map_perf_test_user.c
index f388254896f6..a0310fc70057 100644
--- a/samples/bpf/map_perf_test_user.c
+++ b/samples/bpf/map_perf_test_user.c
@@ -282,7 +282,7 @@ static void test_array_lookup(int cpu)
 
start_time = time_get_ns();
for (i = 0; i < max_cnt; i++)
-   syscall(__NR_getpgrp, 0);
+   syscall(__NR_getppid, 0);
printf("%d:array_lookup %lld lookups per sec\n",
   cpu, max_cnt * 10ll * 64 / (time_get_ns() - start_time));
 }
-- 
2.14.1.581.gf28d330327-goog



[PATCH RFC v3 3/4] samples/bpf: Fix pt_regs issues when cross-compiling

2017-09-03 Thread Joel Fernandes
BPF samples fail to build when cross-compiling for ARM64 because of incorrect
pt_regs param selection. This is because clang defines __x86_64__ and
bpf_headers thinks we're building for x86. Since clang is building for the BPF
target, it shouldn't make assumptions about what target the BPF program is
going to run on. To fix this, lets pass ARCH so the header knows which target
the BPF program is being compiled for and can use the correct pt_regs code.

CC: Alexei Starovoitov 
CC: Daniel Borkmann 
Cc: David Miller 
Signed-off-by: Joel Fernandes 
---
 samples/bpf/Makefile  |  2 +-
 tools/testing/selftests/bpf/bpf_helpers.h | 56 +++
 2 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 13f74b67ca44..ebc2ad69b62c 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -230,7 +230,7 @@ $(obj)/%.o: $(src)/%.c
$(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) -I$(obj) \
-I$(srctree)/tools/testing/selftests/bpf/ \
-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value 
-Wno-pointer-sign \
-   -Wno-compare-distinct-pointer-types \
+   -D__TARGET_ARCH_$(ARCH) -Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
-Wno-unknown-warning-option \
diff --git a/tools/testing/selftests/bpf/bpf_helpers.h 
b/tools/testing/selftests/bpf/bpf_helpers.h
index 36fb9161b34a..4875395b0b52 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -109,7 +109,47 @@ static int (*bpf_skb_under_cgroup)(void *ctx, void *map, 
int index) =
 static int (*bpf_skb_change_head)(void *, int len, int flags) =
(void *) BPF_FUNC_skb_change_head;
 
+/* Scan the ARCH passed in from ARCH env variable (see Makefile) */
+#if defined(__TARGET_ARCH_x86)
+   #define bpf_target_x86
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_s930x)
+   #define bpf_target_s930x
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_arm64)
+   #define bpf_target_arm64
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_mips)
+   #define bpf_target_mips
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_powerpc)
+   #define bpf_target_powerpc
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_sparc)
+   #define bpf_target_sparc
+   #define bpf_target_defined
+#else
+   #undef bpf_target_defined
+#endif
+
+/* Fall back to what the compiler says */
+#ifndef bpf_target_defined
 #if defined(__x86_64__)
+   #define bpf_target_x86
+#elif defined(__s390x__)
+   #define bpf_target_s930x
+#elif defined(__aarch64__)
+   #define bpf_target_arm64
+#elif defined(__mips__)
+   #define bpf_target_mips
+#elif defined(__powerpc__)
+   #define bpf_target_powerpc
+#elif defined(__sparc__)
+   #define bpf_target_sparc
+#endif
+#endif
+
+#if defined(bpf_target_x86)
 
 #define PT_REGS_PARM1(x) ((x)->di)
 #define PT_REGS_PARM2(x) ((x)->si)
@@ -122,7 +162,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->ip)
 
-#elif defined(__s390x__)
+#elif defined(bpf_target_s390x)
 
 #define PT_REGS_PARM1(x) ((x)->gprs[2])
 #define PT_REGS_PARM2(x) ((x)->gprs[3])
@@ -135,7 +175,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->gprs[15])
 #define PT_REGS_IP(x) ((x)->psw.addr)
 
-#elif defined(__aarch64__)
+#elif defined(bpf_target_arm64)
 
 #define PT_REGS_PARM1(x) ((x)->regs[0])
 #define PT_REGS_PARM2(x) ((x)->regs[1])
@@ -148,7 +188,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->pc)
 
-#elif defined(__mips__)
+#elif defined(bpf_target_mips)
 
 #define PT_REGS_PARM1(x) ((x)->regs[4])
 #define PT_REGS_PARM2(x) ((x)->regs[5])
@@ -161,7 +201,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->regs[29])
 #define PT_REGS_IP(x) ((x)->cp0_epc)
 
-#elif defined(__powerpc__)
+#elif defined(bpf_target_powerpc)
 
 #define PT_REGS_PARM1(x) ((x)->gpr[3])
 #define PT_REGS_PARM2(x) ((x)->gpr[4])
@@ -172,7 +212,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->nip)
 
-#elif defined(__sparc__)
+#elif defined(bpf_target_sparc)
 
 #define PT_REGS_PARM1(x) ((x)->u_regs[UREG_I0])
 #define PT_REGS_PARM2(x) ((x)->u_regs[UREG_I1])
@@ -182,6 +222,8 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_RET(x) ((x)->u_regs[UREG_I7])
 #define PT_REGS_RC(x) ((x)->u_regs[UREG_I0])
 #define PT_REGS_SP(x)

[PATCH RFC v3 2/4] samples/bpf: Enable cross compiler support

2017-09-03 Thread Joel Fernandes
When cross compiling, bpf samples use HOSTCC for compiling the non-BPF part of
the sample, however what we really want is to use the cross compiler to build
for the cross target since that is what will load and run the BPF sample.
Detect this and compile samples correctly.

CC: Alexei Starovoitov 
CC: Daniel Borkmann 
Cc: David Miller 
Signed-off-by: Joel Fernandes 
---
 samples/bpf/Makefile | 5 +
 1 file changed, 5 insertions(+)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index cf17c7932a6e..13f74b67ca44 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -177,6 +177,11 @@ HOSTLOADLIBES_syscall_tp += -lelf
 LLC ?= llc
 CLANG ?= clang
 
+# Detect that we're cross compiling and use the cross compiler
+ifdef CROSS_COMPILE
+HOSTCC = $(CROSS_COMPILE)gcc
+endif
+
 # Trick to allow make to be run from this directory
 all:
$(MAKE) -C ../../ $(CURDIR)/
-- 
2.14.1.581.gf28d330327-goog



[PATCH RFC v3 4/4] samples/bpf: Add documentation on cross compilation

2017-09-03 Thread Joel Fernandes
CC: Alexei Starovoitov 
CC: Daniel Borkmann 
CC: David Miller 
Signed-off-by: Joel Fernandes 
---
 samples/bpf/README.rst | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/samples/bpf/README.rst b/samples/bpf/README.rst
index 79f9a58f1872..2b906127ef54 100644
--- a/samples/bpf/README.rst
+++ b/samples/bpf/README.rst
@@ -64,3 +64,13 @@ It is also possible to point make to the newly compiled 
'llc' or
 'clang' command via redefining LLC or CLANG on the make command line::
 
  make samples/bpf/ LLC=~/git/llvm/build/bin/llc 
CLANG=~/git/llvm/build/bin/clang
+
+Cross compiling samples
+---
+Inorder to cross-compile, say for arm64 targets, export CROSS_COMPILE and ARCH
+environment variables before calling make. This will direct make to build
+samples for the cross target.
+
+export ARCH=arm64
+export CROSS_COMPILE="aarch64-linux-gnu-"
+make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang
-- 
2.14.1.581.gf28d330327-goog



[PATCH RFC v3 0/4] Add cross-compilation support to eBPF samples

2017-09-03 Thread Joel Fernandes
These patches fix issues seen when cross-compiling eBPF samples on arm64.
Compared to [1], I dropped the controversial inline-asm patch pending further
discussion on the right way to do it. However these patches are still a step in
the right direction and I wanted them to get in before the more controversial
bit.

[1] https://lkml.org/lkml/2017/8/7/417

Joel Fernandes (4):
  samples/bpf: Use getppid instead of getpgrp for array map stress
  samples/bpf: Enable cross compiler support
  samples/bpf: Fix pt_regs issues when cross-compiling
  samples/bpf: Add documentation on cross compilation

 samples/bpf/Makefile  |  7 +++-
 samples/bpf/README.rst| 10 ++
 samples/bpf/map_perf_test_kern.c  |  2 +-
 samples/bpf/map_perf_test_user.c  |  2 +-
 tools/testing/selftests/bpf/bpf_helpers.h | 56 +++
 5 files changed, 67 insertions(+), 10 deletions(-)

CC: Alexei Starovoitov 
CC: Daniel Borkmann 
Cc: David Miller 
-- 
2.14.1.581.gf28d330327-goog



Re: [PATCH RFC v2 3/5] samples/bpf: Fix inline asm issues building samples on arm64

2017-08-17 Thread Joel Fernandes
On Tue, Aug 8, 2017 at 8:35 PM, David Miller  wrote:
> From: Joel Fernandes 
> Date: Mon, 7 Aug 2017 18:20:49 -0700
>
>> On Mon, Aug 7, 2017 at 11:28 AM, David Miller  wrote:
>>> The amount of hellish hacks we are adding to deal with this is getting
>>> way out of control.
>>
>> I agree with you that hellish hacks are being added which is why it
>> keeps breaking. I think one of the things my series does is to add
>> back inclusion of asm headers that were previously removed (that is
>> the worst hellish hack in my opinion that existing in mainline). So in
>> that respect my patch is an improvement and makes it possible to build
>> for arm64 platforms (which is currently broken in mainline).
>
> Yeah that is a problem.
>
> Perhaps another avenue of attack is to separate "type" header files from
> stuff that has functiond declarations and inline assembler code.

I was thinking that's probably a huge undertaking if you meant doing
the above for every arch?

Also another way could be to modify clang to ignore inline asm
directives during compilation? Do you have any comments about such
approach?

thanks,

-Joel


Re: [PATCH RFC v2 3/5] samples/bpf: Fix inline asm issues building samples on arm64

2017-08-07 Thread Joel Fernandes
Hi Dave,

On Mon, Aug 7, 2017 at 11:28 AM, David Miller  wrote:
>
> Please, no.

Sorry you dislike it, I had intentionally marked it as RFC as its an
idea I was just toying with the idea and posted it early to get
feedback.

>
> The amount of hellish hacks we are adding to deal with this is getting
> way out of control.

I agree with you that hellish hacks are being added which is why it
keeps breaking. I think one of the things my series does is to add
back inclusion of asm headers that were previously removed (that is
the worst hellish hack in my opinion that existing in mainline). So in
that respect my patch is an improvement and makes it possible to build
for arm64 platforms (which is currently broken in mainline).

>
> BPF programs MUST have their own set of asm headers, this is the
> only way to get around this issue in the long term.

Wouldn't that break scripts or bpf code that instruments/trace arch
specific code?

>
> I am also strongly against adding -static to the build.

I can drop -static if you prefer, that's not an issue.

As I understand it, there are no other cleaner alternatives and this
patchset makes the samples work. I would even argue that's its more
functional than previous attempts and fixes something broken in
mainline in a more generic way. If you can provide an example of where
my patchset may not work, I would love to hear it. My whole idea was
to do it in a way that makes future breakage not happen. I don't think
that leaving things broken in this state for extended periods of time
makes sense and IMHO will slow usage of bpf samples on other
platforms.

thanks,

-Joel


Re: [PATCH RFC v2 3/5] samples/bpf: Fix inline asm issues building samples on arm64

2017-08-07 Thread Joel Fernandes
On Mon, Aug 7, 2017 at 6:06 AM, Joel Fernandes  wrote:
> inline assembly has haunted building samples on arm64 for quite sometime.
> This patch uses the pre-processor to noop all occurences of inline asm when
> compiling the BPF sample for the BPF target.
>
> This patch reintroduces inclusion of asm/sysregs.h which needs to be included
> to avoid compiler errors now, see [1]. Previously a hack prevented this
> inclusion [2] (to avoid the exact problem this patch fixes - skipping inline
> assembler) but the hack causes other errors now and no longer works.
>
> Using the preprocessor to noop the inline asm occurences, we also avoid
> any future unstable hackery needed (such as those that skip asm headers)
> and provides information that asm headers may have which could have been
> used but which the inline asm issues prevented. This is the least messy
> of all hacks in my opinion.
>
> [1] https://lkml.org/lkml/2017/8/5/143
> [2] https://lists.linaro.org/pipermail/linaro-kernel/2015-November/024036.html
>
> Signed-off-by: Joel Fernandes 
> ---
>  samples/bpf/Makefile   | 40 +---
>  samples/bpf/arm64_asmstubs.h   |  3 +++
>  samples/bpf/bpf_helpers.h  | 12 ++--
>  samples/bpf/generic_asmstubs.h |  4 
>  4 files changed, 46 insertions(+), 13 deletions(-)
>  create mode 100644 samples/bpf/arm64_asmstubs.h
>  create mode 100644 samples/bpf/generic_asmstubs.h
>
> diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
> index e5642c8c144d..7591cdd7fe69 100644
> --- a/samples/bpf/Makefile
> +++ b/samples/bpf/Makefile
> @@ -151,6 +151,8 @@ HOSTLOADLIBES_test_map_in_map += -lelf
>  #  make samples/bpf/ LLC=~/git/llvm/build/bin/llc 
> CLANG=~/git/llvm/build/bin/clang
>  LLC ?= llc
>  CLANG ?= clang
> +PERL ?= perl
> +RM ?= rm
>
>  # Detect that we're cross compiling and use the right compilers and flags
>  ifdef CROSS_COMPILE
> @@ -186,14 +188,38 @@ verify_target_bpf: verify_cmds
>
>  $(src)/*.c: verify_target_bpf
>
> -# asm/sysreg.h - inline assembly used by it is incompatible with llvm.
> -# But, there is no easy way to fix it, so just exclude it since it is
> -# useless for BPF samples.
> -$(obj)/%.o: $(src)/%.c
> -   $(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
> -   -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value 
> -Wno-pointer-sign \
> +curdir := $(shell dirname $(realpath $(lastword $(MAKEFILE_LIST
> +ifeq ($(wildcard  $(curdir)/${ARCH}_asmstubs.h),)
> +ARCH_ASM_STUBS :=
> +else
> +ARCH_ASM_STUBS := -include $(src)/${ARCH}_asmstubs.h
> +endif
> +
> +ASM_STUBS := ${ARCH_ASM_STUBS} -include $(src)/generic_asmstubs.h
> +
> +CLANG_ARGS = $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
> +   -D__KERNEL__ -Wno-unused-value -Wno-pointer-sign \
> +   $(ASM_STUBS) \
> -Wno-compare-distinct-pointer-types \
> -Wno-gnu-variable-sized-type-not-at-end \
> -Wno-address-of-packed-member -Wno-tautological-compare \
> -Wno-unknown-warning-option \
> -   -O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=obj -o 
> $@
> +   -O2 -emit-llvm
> +
> +$(obj)/%.o: $(src)/%.c
> +   # Steps to compile BPF sample while getting rid of inline asm
> +   # This has the advantage of not having to skip important asm headers
> +   # Step 1. Use clang preprocessor to stub out asm() calls
> +   # Step 2. Replace all "asm volatile" with single keyword "asmvolatile"
> +   # Step 3. Use clang preprocessor to noop all asm volatile() calls
> +   # and restore asm_bpf to asm for BPF's asm directives
> +   # Step 4. Compile and link
> +
> +   $(CLANG) -E $(CLANG_ARGS) -c $< -o - | \
> +   $(PERL) -pe "s/[_\s]*asm[_\s]*volatile[_\s]*/asmvolatile/g" | \
> +   $(CLANG) -E $(ASM_STUBS) - -o - | \
> +   $(CLANG) -E -Dasm_bpf=asm - -o $@.tmp.c
> +
> +   $(CLANG) $(CLANG_ARGS) -c $@.tmp.c \
> +   -o - | $(LLC) -march=bpf -filetype=obj -o $@

Just found an issue here that asm_bpf will be stubbed out by
CLANG_ARGS, will fix it next rev. I'll also try to do object
inspection to make sure the asm directive bpf_helpers is working.
thanks!

-Joel



> +   $(RM) $@.tmp.c
> diff --git a/samples/bpf/arm64_asmstubs.h b/samples/bpf/arm64_asmstubs.h
> new file mode 100644
> index ..23d47dbe61b1
> --- /dev/null
> +++ b/samples/bpf/arm64_asmstubs.h
> @@ -0,0 +1,3 @@
> +/* Special handing for current_stack_pointer */
> +#define __ASM_STACK_POINTER_H
> +#define current_stack_pointer 0
> diff --git a/samples/bpf

[PATCH RFC v2 2/5] samples/bpf: Enable cross compiler support

2017-08-07 Thread Joel Fernandes
When cross compiling, bpf samples use HOSTCC, however what we really want is to
use the cross compiler to build for the cross target since that is what will
help run the BPF target code.  Detect this and also set -static as LDFLAGS
since often times we don't have control over what C library the cross target is
running and its not smart to rely on it.

Signed-off-by: Joel Fernandes 
---
 samples/bpf/Makefile | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 6c7468eb3684..e5642c8c144d 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -152,6 +152,12 @@ HOSTLOADLIBES_test_map_in_map += -lelf
 LLC ?= llc
 CLANG ?= clang
 
+# Detect that we're cross compiling and use the right compilers and flags
+ifdef CROSS_COMPILE
+HOSTCC = $(CROSS_COMPILE)gcc
+HOSTLDFLAGS += -static
+endif
+
 # Trick to allow make to be run from this directory
 all:
$(MAKE) -C ../../ $(CURDIR)/
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



[PATCH RFC v2 1/5] samples/bpf: Use getppid instead of getpgrp for array map stress

2017-08-07 Thread Joel Fernandes
When cross-compiling the bpf sample map_perf_test for aarch64, I find that
__NR_getpgrp is undefined. This causes build errors. This syscall is deprecated
and requires defining __ARCH_WANT_SYSCALL_DEPRECATED. To avoid having to define
that, just use a different syscall (getppid) for the array map stress test.

Signed-off-by: Joel Fernandes 
---
 samples/bpf/map_perf_test_kern.c | 2 +-
 samples/bpf/map_perf_test_user.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/samples/bpf/map_perf_test_kern.c b/samples/bpf/map_perf_test_kern.c
index 245165817fbe..038ffec295cf 100644
--- a/samples/bpf/map_perf_test_kern.c
+++ b/samples/bpf/map_perf_test_kern.c
@@ -232,7 +232,7 @@ int stress_hash_map_lookup(struct pt_regs *ctx)
return 0;
 }
 
-SEC("kprobe/sys_getpgrp")
+SEC("kprobe/sys_getppid")
 int stress_array_map_lookup(struct pt_regs *ctx)
 {
u32 key = 1, i;
diff --git a/samples/bpf/map_perf_test_user.c b/samples/bpf/map_perf_test_user.c
index 1a8894b5ac51..1e9e68942197 100644
--- a/samples/bpf/map_perf_test_user.c
+++ b/samples/bpf/map_perf_test_user.c
@@ -232,7 +232,7 @@ static void test_array_lookup(int cpu)
 
start_time = time_get_ns();
for (i = 0; i < max_cnt; i++)
-   syscall(__NR_getpgrp, 0);
+   syscall(__NR_getppid, 0);
printf("%d:array_lookup %lld lookups per sec\n",
   cpu, max_cnt * 10ll * 64 / (time_get_ns() - start_time));
 }
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



[PATCH RFC v2 3/5] samples/bpf: Fix inline asm issues building samples on arm64

2017-08-07 Thread Joel Fernandes
inline assembly has haunted building samples on arm64 for quite sometime.
This patch uses the pre-processor to noop all occurences of inline asm when
compiling the BPF sample for the BPF target.

This patch reintroduces inclusion of asm/sysregs.h which needs to be included
to avoid compiler errors now, see [1]. Previously a hack prevented this
inclusion [2] (to avoid the exact problem this patch fixes - skipping inline
assembler) but the hack causes other errors now and no longer works.

Using the preprocessor to noop the inline asm occurences, we also avoid
any future unstable hackery needed (such as those that skip asm headers)
and provides information that asm headers may have which could have been
used but which the inline asm issues prevented. This is the least messy
of all hacks in my opinion.

[1] https://lkml.org/lkml/2017/8/5/143
[2] https://lists.linaro.org/pipermail/linaro-kernel/2015-November/024036.html

Signed-off-by: Joel Fernandes 
---
 samples/bpf/Makefile   | 40 +---
 samples/bpf/arm64_asmstubs.h   |  3 +++
 samples/bpf/bpf_helpers.h  | 12 ++--
 samples/bpf/generic_asmstubs.h |  4 
 4 files changed, 46 insertions(+), 13 deletions(-)
 create mode 100644 samples/bpf/arm64_asmstubs.h
 create mode 100644 samples/bpf/generic_asmstubs.h

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index e5642c8c144d..7591cdd7fe69 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -151,6 +151,8 @@ HOSTLOADLIBES_test_map_in_map += -lelf
 #  make samples/bpf/ LLC=~/git/llvm/build/bin/llc 
CLANG=~/git/llvm/build/bin/clang
 LLC ?= llc
 CLANG ?= clang
+PERL ?= perl
+RM ?= rm
 
 # Detect that we're cross compiling and use the right compilers and flags
 ifdef CROSS_COMPILE
@@ -186,14 +188,38 @@ verify_target_bpf: verify_cmds
 
 $(src)/*.c: verify_target_bpf
 
-# asm/sysreg.h - inline assembly used by it is incompatible with llvm.
-# But, there is no easy way to fix it, so just exclude it since it is
-# useless for BPF samples.
-$(obj)/%.o: $(src)/%.c
-   $(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
-   -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value 
-Wno-pointer-sign \
+curdir := $(shell dirname $(realpath $(lastword $(MAKEFILE_LIST
+ifeq ($(wildcard  $(curdir)/${ARCH}_asmstubs.h),)
+ARCH_ASM_STUBS :=
+else
+ARCH_ASM_STUBS := -include $(src)/${ARCH}_asmstubs.h
+endif
+
+ASM_STUBS := ${ARCH_ASM_STUBS} -include $(src)/generic_asmstubs.h
+
+CLANG_ARGS = $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
+   -D__KERNEL__ -Wno-unused-value -Wno-pointer-sign \
+   $(ASM_STUBS) \
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
-Wno-unknown-warning-option \
-   -O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@
+   -O2 -emit-llvm
+
+$(obj)/%.o: $(src)/%.c
+   # Steps to compile BPF sample while getting rid of inline asm
+   # This has the advantage of not having to skip important asm headers
+   # Step 1. Use clang preprocessor to stub out asm() calls
+   # Step 2. Replace all "asm volatile" with single keyword "asmvolatile"
+   # Step 3. Use clang preprocessor to noop all asm volatile() calls
+   # and restore asm_bpf to asm for BPF's asm directives
+   # Step 4. Compile and link
+
+   $(CLANG) -E $(CLANG_ARGS) -c $< -o - | \
+   $(PERL) -pe "s/[_\s]*asm[_\s]*volatile[_\s]*/asmvolatile/g" | \
+   $(CLANG) -E $(ASM_STUBS) - -o - | \
+   $(CLANG) -E -Dasm_bpf=asm - -o $@.tmp.c
+
+   $(CLANG) $(CLANG_ARGS) -c $@.tmp.c \
+   -o - | $(LLC) -march=bpf -filetype=obj -o $@
+   $(RM) $@.tmp.c
diff --git a/samples/bpf/arm64_asmstubs.h b/samples/bpf/arm64_asmstubs.h
new file mode 100644
index ..23d47dbe61b1
--- /dev/null
+++ b/samples/bpf/arm64_asmstubs.h
@@ -0,0 +1,3 @@
+/* Special handing for current_stack_pointer */
+#define __ASM_STACK_POINTER_H
+#define current_stack_pointer 0
diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index 9a9c95f2c9fb..67c9c4438e4b 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -64,12 +64,12 @@ static int (*bpf_xdp_adjust_head)(void *ctx, int offset) =
  * emit BPF_LD_ABS and BPF_LD_IND instructions
  */
 struct sk_buff;
-unsigned long long load_byte(void *skb,
-unsigned long long off) asm("llvm.bpf.load.byte");
-unsigned long long load_half(void *skb,
-unsigned long long off) asm("llvm.bpf.load.half");
-unsigned long long load_word(void *skb,
-unsigned long long off) asm("llvm.bpf.load.word");
+unsigned long long load_byte(void *skb, unsigned long long off)
+ 

[PATCH RFC v2 5/5] samples/bpf: Add documentation on cross compilation

2017-08-07 Thread Joel Fernandes
Signed-off-by: Joel Fernandes 
---
 samples/bpf/README.rst | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/samples/bpf/README.rst b/samples/bpf/README.rst
index 79f9a58f1872..2b906127ef54 100644
--- a/samples/bpf/README.rst
+++ b/samples/bpf/README.rst
@@ -64,3 +64,13 @@ It is also possible to point make to the newly compiled 
'llc' or
 'clang' command via redefining LLC or CLANG on the make command line::
 
  make samples/bpf/ LLC=~/git/llvm/build/bin/llc 
CLANG=~/git/llvm/build/bin/clang
+
+Cross compiling samples
+---
+Inorder to cross-compile, say for arm64 targets, export CROSS_COMPILE and ARCH
+environment variables before calling make. This will direct make to build
+samples for the cross target.
+
+export ARCH=arm64
+export CROSS_COMPILE="aarch64-linux-gnu-"
+make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



[PATCH RFC v2 4/5] samples/bpf: Fix pt_regs issues when cross-compiling

2017-08-07 Thread Joel Fernandes
BPF samples fail to build when cross-compiling for ARM64 because of incorrect
pt_regs param selection. This is because clang defines __x86_64__ and
bpf_headers thinks we're building for x86. Since clang is building for the BPF
target, it shouldn't make assumptions about what target the BPF program is
going to run on. To fix this, lets pass ARCH so the header knows which target
the BPF program is being compiled for and can use the correct pt_regs code.

Signed-off-by: Joel Fernandes 
---
 samples/bpf/Makefile  |  2 +-
 samples/bpf/bpf_helpers.h | 49 +--
 2 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 7591cdd7fe69..8cbcaffe4001 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -199,7 +199,7 @@ ASM_STUBS := ${ARCH_ASM_STUBS} -include 
$(src)/generic_asmstubs.h
 
 CLANG_ARGS = $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
-D__KERNEL__ -Wno-unused-value -Wno-pointer-sign \
-   $(ASM_STUBS) \
+   -D__TARGET_ARCH_$(ARCH) $(ASM_STUBS) \
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index 67c9c4438e4b..199d2e32703a 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -96,7 +96,42 @@ static int (*bpf_skb_under_cgroup)(void *ctx, void *map, int 
index) =
 static int (*bpf_skb_change_head)(void *, int len, int flags) =
(void *) BPF_FUNC_skb_change_head;
 
+/* Scan the ARCH passed in from ARCH env variable (see Makefile) */
+#if defined(__TARGET_ARCH_x86)
+   #define bpf_target_x86
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_s930x)
+   #define bpf_target_s930x
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_arm64)
+   #define bpf_target_arm64
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_powerpc)
+   #define bpf_target_powerpc
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_sparc)
+   #define bpf_target_sparc
+   #define bpf_target_defined
+#else
+   #undef bpf_target_defined
+#endif
+
+/* Fall back to what the compiler says */
+#ifndef bpf_target_defined
 #if defined(__x86_64__)
+   #define bpf_target_x86
+#elif defined(__s390x__)
+   #define bpf_target_s930x
+#elif defined(__aarch64__)
+   #define bpf_target_arm64
+#elif defined(__powerpc__)
+   #define bpf_target_powerpc
+#elif defined(__sparc__)
+   #define bpf_target_sparc
+#endif
+#endif
+
+#if defined(bpf_target_x86)
 
 #define PT_REGS_PARM1(x) ((x)->di)
 #define PT_REGS_PARM2(x) ((x)->si)
@@ -109,7 +144,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->ip)
 
-#elif defined(__s390x__)
+#elif defined(bpf_target_s390x)
 
 #define PT_REGS_PARM1(x) ((x)->gprs[2])
 #define PT_REGS_PARM2(x) ((x)->gprs[3])
@@ -122,7 +157,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->gprs[15])
 #define PT_REGS_IP(x) ((x)->psw.addr)
 
-#elif defined(__aarch64__)
+#elif defined(bpf_target_arm64)
 
 #define PT_REGS_PARM1(x) ((x)->regs[0])
 #define PT_REGS_PARM2(x) ((x)->regs[1])
@@ -135,7 +170,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->pc)
 
-#elif defined(__powerpc__)
+#elif defined(bpf_target_powerpc)
 
 #define PT_REGS_PARM1(x) ((x)->gpr[3])
 #define PT_REGS_PARM2(x) ((x)->gpr[4])
@@ -146,7 +181,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->nip)
 
-#elif defined(__sparc__)
+#elif defined(bpf_target_sparc)
 
 #define PT_REGS_PARM1(x) ((x)->u_regs[UREG_I0])
 #define PT_REGS_PARM2(x) ((x)->u_regs[UREG_I1])
@@ -156,6 +191,8 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_RET(x) ((x)->u_regs[UREG_I7])
 #define PT_REGS_RC(x) ((x)->u_regs[UREG_I0])
 #define PT_REGS_SP(x) ((x)->u_regs[UREG_FP])
+
+/* Should this also be a bpf_target check for the sparc case? */
 #if defined(__arch64__)
 #define PT_REGS_IP(x) ((x)->tpc)
 #else
@@ -164,10 +201,10 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 
 #endif
 
-#ifdef __powerpc__
+#ifdef bpf_target_powerpc
 #define BPF_KPROBE_READ_RET_IP(ip, ctx)({ (ip) = (ctx)->link; 
})
 #define BPF_KRETPROBE_READ_RET_IP  BPF_KPROBE_READ_RET_IP
-#elif defined(__sparc__)
+#elif bpf_target_sparc
 #define BPF_KPROBE_READ_RET_IP(ip, ctx)({ (ip) = 
PT_REGS_RET(ctx); })
 #define BPF_KRETPROBE_READ_RET_IP  BPF_KPROBE_READ_RET_IP
 #else
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



[PATCH RFC 2/5] samples/bpf: Enable cross compiler support

2017-08-07 Thread Joel Fernandes
When cross compiling, bpf samples use HOSTCC, however what we really want is to
use the cross compiler to build for the cross target since that is what will
help run the BPF target code.  Detect this and also set -static as LDFLAGS
since often times we don't have control over what C library the cross target is
running and its not smart to rely on it.

Signed-off-by: Joel Fernandes 
---
 samples/bpf/Makefile | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 6c7468eb3684..e5642c8c144d 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -152,6 +152,12 @@ HOSTLOADLIBES_test_map_in_map += -lelf
 LLC ?= llc
 CLANG ?= clang
 
+# Detect that we're cross compiling and use the right compilers and flags
+ifdef CROSS_COMPILE
+HOSTCC = $(CROSS_COMPILE)gcc
+HOSTLDFLAGS += -static
+endif
+
 # Trick to allow make to be run from this directory
 all:
$(MAKE) -C ../../ $(CURDIR)/
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



[PATCH RFC 3/5] samples/bpf: Fix inline asm issues building samples on arm64

2017-08-07 Thread Joel Fernandes
inline assembly has haunted building samples on arm64 for quite sometime.
This patch uses the pre-processor to noop all occurences of inline asm when
compiling the BPF sample for the BPF target.

This patch reintroduces inclusion of asm/sysregs.h which needs to be included
to avoid compiler errors now, see [1]. Previously a hack prevented this
inclusion [2] (to avoid the exact problem this patch fixes - skipping inline
assembler) but the hack causes other errors now and no longer works.

Using the preprocessor to noop the inline asm occurences, we also avoid
any future unstable hackery needed (such as those that skip asm headers)
and provides information that asm headers may have which could have been
used but which the inline asm issues prevented. This is the least messy
of all hacks in my opinion.

[1] https://lkml.org/lkml/2017/8/5/143
[2] https://lists.linaro.org/pipermail/linaro-kernel/2015-November/024036.html

Signed-off-by: Joel Fernandes 
---
 samples/bpf/Makefile   | 40 +---
 samples/bpf/arm64_asmstubs.h   |  3 +++
 samples/bpf/bpf_helpers.h  | 12 ++--
 samples/bpf/generic_asmstubs.h |  4 
 4 files changed, 46 insertions(+), 13 deletions(-)
 create mode 100644 samples/bpf/arm64_asmstubs.h
 create mode 100644 samples/bpf/generic_asmstubs.h

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index e5642c8c144d..7591cdd7fe69 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -151,6 +151,8 @@ HOSTLOADLIBES_test_map_in_map += -lelf
 #  make samples/bpf/ LLC=~/git/llvm/build/bin/llc 
CLANG=~/git/llvm/build/bin/clang
 LLC ?= llc
 CLANG ?= clang
+PERL ?= perl
+RM ?= rm
 
 # Detect that we're cross compiling and use the right compilers and flags
 ifdef CROSS_COMPILE
@@ -186,14 +188,38 @@ verify_target_bpf: verify_cmds
 
 $(src)/*.c: verify_target_bpf
 
-# asm/sysreg.h - inline assembly used by it is incompatible with llvm.
-# But, there is no easy way to fix it, so just exclude it since it is
-# useless for BPF samples.
-$(obj)/%.o: $(src)/%.c
-   $(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
-   -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value 
-Wno-pointer-sign \
+curdir := $(shell dirname $(realpath $(lastword $(MAKEFILE_LIST
+ifeq ($(wildcard  $(curdir)/${ARCH}_asmstubs.h),)
+ARCH_ASM_STUBS :=
+else
+ARCH_ASM_STUBS := -include $(src)/${ARCH}_asmstubs.h
+endif
+
+ASM_STUBS := ${ARCH_ASM_STUBS} -include $(src)/generic_asmstubs.h
+
+CLANG_ARGS = $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
+   -D__KERNEL__ -Wno-unused-value -Wno-pointer-sign \
+   $(ASM_STUBS) \
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
-Wno-unknown-warning-option \
-   -O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@
+   -O2 -emit-llvm
+
+$(obj)/%.o: $(src)/%.c
+   # Steps to compile BPF sample while getting rid of inline asm
+   # This has the advantage of not having to skip important asm headers
+   # Step 1. Use clang preprocessor to stub out asm() calls
+   # Step 2. Replace all "asm volatile" with single keyword "asmvolatile"
+   # Step 3. Use clang preprocessor to noop all asm volatile() calls
+   # and restore asm_bpf to asm for BPF's asm directives
+   # Step 4. Compile and link
+
+   $(CLANG) -E $(CLANG_ARGS) -c $< -o - | \
+   $(PERL) -pe "s/[_\s]*asm[_\s]*volatile[_\s]*/asmvolatile/g" | \
+   $(CLANG) -E $(ASM_STUBS) - -o - | \
+   $(CLANG) -E -Dasm_bpf=asm - -o $@.tmp.c
+
+   $(CLANG) $(CLANG_ARGS) -c $@.tmp.c \
+   -o - | $(LLC) -march=bpf -filetype=obj -o $@
+   $(RM) $@.tmp.c
diff --git a/samples/bpf/arm64_asmstubs.h b/samples/bpf/arm64_asmstubs.h
new file mode 100644
index ..23d47dbe61b1
--- /dev/null
+++ b/samples/bpf/arm64_asmstubs.h
@@ -0,0 +1,3 @@
+/* Special handing for current_stack_pointer */
+#define __ASM_STACK_POINTER_H
+#define current_stack_pointer 0
diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index 9a9c95f2c9fb..67c9c4438e4b 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -64,12 +64,12 @@ static int (*bpf_xdp_adjust_head)(void *ctx, int offset) =
  * emit BPF_LD_ABS and BPF_LD_IND instructions
  */
 struct sk_buff;
-unsigned long long load_byte(void *skb,
-unsigned long long off) asm("llvm.bpf.load.byte");
-unsigned long long load_half(void *skb,
-unsigned long long off) asm("llvm.bpf.load.half");
-unsigned long long load_word(void *skb,
-unsigned long long off) asm("llvm.bpf.load.word");
+unsigned long long load_byte(void *skb, unsigned long long off)
+ 

[PATCH RFC 4/5] samples/bpf: Fix pt_regs issues when cross-compiling

2017-08-07 Thread Joel Fernandes
BPF samples fail to build when cross-compiling for ARM64 because of incorrect
pt_regs param selection. This is because clang defines __x86_64__ and
bpf_headers thinks we're building for x86. Since clang is building for the BPF
target, it shouldn't make assumptions about what target the BPF program is
going to run on. To fix this, lets pass ARCH so the header knows which target
the BPF program is being compiled for and can use the correct pt_regs code.

Signed-off-by: Joel Fernandes 
---
 samples/bpf/Makefile  |  2 +-
 samples/bpf/bpf_helpers.h | 49 +--
 2 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 7591cdd7fe69..8cbcaffe4001 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -199,7 +199,7 @@ ASM_STUBS := ${ARCH_ASM_STUBS} -include 
$(src)/generic_asmstubs.h
 
 CLANG_ARGS = $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
-D__KERNEL__ -Wno-unused-value -Wno-pointer-sign \
-   $(ASM_STUBS) \
+   -D__TARGET_ARCH_$(ARCH) $(ASM_STUBS) \
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index 67c9c4438e4b..199d2e32703a 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -96,7 +96,42 @@ static int (*bpf_skb_under_cgroup)(void *ctx, void *map, int 
index) =
 static int (*bpf_skb_change_head)(void *, int len, int flags) =
(void *) BPF_FUNC_skb_change_head;
 
+/* Scan the ARCH passed in from ARCH env variable (see Makefile) */
+#if defined(__TARGET_ARCH_x86)
+   #define bpf_target_x86
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_s930x)
+   #define bpf_target_s930x
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_arm64)
+   #define bpf_target_arm64
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_powerpc)
+   #define bpf_target_powerpc
+   #define bpf_target_defined
+#elif defined(__TARGET_ARCH_sparc)
+   #define bpf_target_sparc
+   #define bpf_target_defined
+#else
+   #undef bpf_target_defined
+#endif
+
+/* Fall back to what the compiler says */
+#ifndef bpf_target_defined
 #if defined(__x86_64__)
+   #define bpf_target_x86
+#elif defined(__s390x__)
+   #define bpf_target_s930x
+#elif defined(__aarch64__)
+   #define bpf_target_arm64
+#elif defined(__powerpc__)
+   #define bpf_target_powerpc
+#elif defined(__sparc__)
+   #define bpf_target_sparc
+#endif
+#endif
+
+#if defined(bpf_target_x86)
 
 #define PT_REGS_PARM1(x) ((x)->di)
 #define PT_REGS_PARM2(x) ((x)->si)
@@ -109,7 +144,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->ip)
 
-#elif defined(__s390x__)
+#elif defined(bpf_target_s390x)
 
 #define PT_REGS_PARM1(x) ((x)->gprs[2])
 #define PT_REGS_PARM2(x) ((x)->gprs[3])
@@ -122,7 +157,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->gprs[15])
 #define PT_REGS_IP(x) ((x)->psw.addr)
 
-#elif defined(__aarch64__)
+#elif defined(bpf_target_arm64)
 
 #define PT_REGS_PARM1(x) ((x)->regs[0])
 #define PT_REGS_PARM2(x) ((x)->regs[1])
@@ -135,7 +170,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->pc)
 
-#elif defined(__powerpc__)
+#elif defined(bpf_target_powerpc)
 
 #define PT_REGS_PARM1(x) ((x)->gpr[3])
 #define PT_REGS_PARM2(x) ((x)->gpr[4])
@@ -146,7 +181,7 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->nip)
 
-#elif defined(__sparc__)
+#elif defined(bpf_target_sparc)
 
 #define PT_REGS_PARM1(x) ((x)->u_regs[UREG_I0])
 #define PT_REGS_PARM2(x) ((x)->u_regs[UREG_I1])
@@ -156,6 +191,8 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 #define PT_REGS_RET(x) ((x)->u_regs[UREG_I7])
 #define PT_REGS_RC(x) ((x)->u_regs[UREG_I0])
 #define PT_REGS_SP(x) ((x)->u_regs[UREG_FP])
+
+/* Should this also be a bpf_target check for the sparc case? */
 #if defined(__arch64__)
 #define PT_REGS_IP(x) ((x)->tpc)
 #else
@@ -164,10 +201,10 @@ static int (*bpf_skb_change_head)(void *, int len, int 
flags) =
 
 #endif
 
-#ifdef __powerpc__
+#ifdef bpf_target_powerpc
 #define BPF_KPROBE_READ_RET_IP(ip, ctx)({ (ip) = (ctx)->link; 
})
 #define BPF_KRETPROBE_READ_RET_IP  BPF_KPROBE_READ_RET_IP
-#elif defined(__sparc__)
+#elif bpf_target_sparc
 #define BPF_KPROBE_READ_RET_IP(ip, ctx)({ (ip) = 
PT_REGS_RET(ctx); })
 #define BPF_KRETPROBE_READ_RET_IP  BPF_KPROBE_READ_RET_IP
 #else
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



[PATCH RFC 1/5] samples/bpf: Use getppid instead of getpgrp for array map stress

2017-08-07 Thread Joel Fernandes
When cross-compiling the bpf sample map_perf_test for aarch64, I find that
__NR_getpgrp is undefined. This causes build errors. This syscall is deprecated
and requires defining __ARCH_WANT_SYSCALL_DEPRECATED. To avoid having to define
that, just use a different syscall (getppid) for the array map stress test.

Signed-off-by: Joel Fernandes 
---
 samples/bpf/map_perf_test_kern.c | 2 +-
 samples/bpf/map_perf_test_user.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/samples/bpf/map_perf_test_kern.c b/samples/bpf/map_perf_test_kern.c
index 245165817fbe..038ffec295cf 100644
--- a/samples/bpf/map_perf_test_kern.c
+++ b/samples/bpf/map_perf_test_kern.c
@@ -232,7 +232,7 @@ int stress_hash_map_lookup(struct pt_regs *ctx)
return 0;
 }
 
-SEC("kprobe/sys_getpgrp")
+SEC("kprobe/sys_getppid")
 int stress_array_map_lookup(struct pt_regs *ctx)
 {
u32 key = 1, i;
diff --git a/samples/bpf/map_perf_test_user.c b/samples/bpf/map_perf_test_user.c
index 1a8894b5ac51..1e9e68942197 100644
--- a/samples/bpf/map_perf_test_user.c
+++ b/samples/bpf/map_perf_test_user.c
@@ -232,7 +232,7 @@ static void test_array_lookup(int cpu)
 
start_time = time_get_ns();
for (i = 0; i < max_cnt; i++)
-   syscall(__NR_getpgrp, 0);
+   syscall(__NR_getppid, 0);
printf("%d:array_lookup %lld lookups per sec\n",
   cpu, max_cnt * 10ll * 64 / (time_get_ns() - start_time));
 }
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



[PATCH RFC 5/5] samples/bpf: Add documentation on cross compilation

2017-08-07 Thread Joel Fernandes
Signed-off-by: Joel Fernandes 
---
 samples/bpf/README.rst | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/samples/bpf/README.rst b/samples/bpf/README.rst
index 79f9a58f1872..2b906127ef54 100644
--- a/samples/bpf/README.rst
+++ b/samples/bpf/README.rst
@@ -64,3 +64,13 @@ It is also possible to point make to the newly compiled 
'llc' or
 'clang' command via redefining LLC or CLANG on the make command line::
 
  make samples/bpf/ LLC=~/git/llvm/build/bin/llc 
CLANG=~/git/llvm/build/bin/clang
+
+Cross compiling samples
+---
+Inorder to cross-compile, say for arm64 targets, export CROSS_COMPILE and ARCH
+environment variables before calling make. This will direct make to build
+samples for the cross target.
+
+export ARCH=arm64
+export CROSS_COMPILE="aarch64-linux-gnu-"
+make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



Re: [PATCH] samples/bpf: Fix cross compiler error with bpf sample

2017-08-05 Thread Joel Fernandes
Hi Daniel,


>>>
>>> So the only arch that sets __ARCH_WANT_SYSCALL_DEPRECATED
>>> is score:
>>>
>>>$ git grep -n __ARCH_WANT_SYSCALL_DEPRECATED
>>>arch/score/include/uapi/asm/unistd.h:7:#define
>>> __ARCH_WANT_SYSCALL_DEPRECATED
>>>include/uapi/asm-generic/unistd.h:837:#ifdef
>>> __ARCH_WANT_SYSCALL_DEPRECATED
>>>include/uapi/asm-generic/unistd.h:899:#endif /*
>>> __ARCH_WANT_SYSCALL_DEPRECATED */
>>>
>>> But even if this would make aarch64 compile, the syscall
>>> numbers don't match up:
>>>
>>>$ git grep -n __NR_getpgrp include/uapi/asm-generic/unistd.h
>>>include/uapi/asm-generic/unistd.h:841:#define __NR_getpgrp 1060
>>>include/uapi/asm-generic/unistd.h:843:__SYSCALL(__NR_getpgrp,
>>> sys_getpgrp)
>>>
>>> The only thing that can be found on arm64 is:
>>>
>>>$ git grep -n __NR_getpgrp arch/arm64/
>>>arch/arm64/include/asm/unistd32.h:154:#define __NR_getpgrp 65
>>>arch/arm64/include/asm/unistd32.h:155:__SYSCALL(__NR_getpgrp,
>>> sys_getpgrp)
>>>
>>> In arch/arm64/include/asm/unistd.h, it does include the
>>> uapi/asm/unistd.h when compat is not set, but without the
>>> __ARCH_WANT_SYSCALL_DEPRECATED. That doesn't look correct
>>> unless I'm missing something, hmm, can't we just attach the
>>> kprobes to a different syscall, one that is not deprecated,
>>> so that we don't run into this in the first place?
>>
>>
>> Yes, I agree that's better. I think we can use getpgid. I'll try to
>> whip something today and send it out.

So switching to getppid fixes it, however most of the _kern.o BPF
samples still don't build on aarch64 for me.

The trouble is with inline assembler statements in the ARM64 asm
include headers (which I think upsets LLVM/clang when building for the
BPF target).

Also about skipping headers to fix this, currently we pass
-D__ASM_SYSREG_H from samples/bpf/Makefile so that inline assembler
statements are skipped (this was added in 2015 to this file). However,
not including this causes build errors at the moment. So as sysreg.h
isn't included, the SYS_MIDR_EL1 macro from
arch/arm64/include/asm/sysreg.h is missing causing this build error.
This caused by an include originating from skbuff.h :

In file included from samples/bpf/map_perf_test_kern.c:7:
In file included from ./include/linux/skbuff.h:17:
In file included from ./include/linux/kernel.h:13:
In file included from ./include/linux/printk.h:8:
In file included from ./include/linux/cache.h:5:
In file included from ./arch/arm64/include/asm/cache.h:19:
./arch/arm64/include/asm/cputype.h:119:9: error: use of undeclared
identifier 'SYS_MPIDR_EL1'
return read_cpuid(MPIDR_EL1);
   ^

There are other paths too like mm.h including asm/pgtable.h which
breaks because of the inline assembler issues.

In file included from samples/bpf/map_perf_test_kern.c:7:
In file included from ./include/linux/skbuff.h:34:
In file included from ./include/linux/dma-mapping.h:10:
In file included from ./include/linux/scatterlist.h:7:
In file included from ./include/linux/mm.h:70:
./arch/arm64/include/asm/pgtable.h:607:9: error: value
'18446744073709550591' out of range for constraint 'L'
: "L" (~PTE_AF), "I" (ilog2(PTE_AF)));


I believe the main problem here is we are trying to build for BPF
target while including ARM64 asm headers (which I think we can't
avoid?). I was thinking of just setting ARCH to x86 when compiling the
_kern.o parts of the bpf samples. However that doesn't make sense as
we probably need to be arch aware in the samples?

I am at a loss at the moment for ideas. I would love to get any other
ideas - how does x86 solve this issue?. I tried doing something like:
skip as many asm headers using the -D trick,  passing "-include
" and defining as many things as I can in stubs.h as noops.
However I got stuck with that idea when I hit the above mm build
error.

I am also CC'ing some ARM64 folks as well for any thoughts, I am
looking forward to getting the samples working on my arm64 platform.

Thanks!

Best,

-Joel




>
> Ok, cool. Please make sure that this doesn't clash with anything
> else attached to map_perf_test_kern.c already given the obj file
> is loaded first with the attachment points.
>
>> I also wanted to fix something else, HOSTCC is set to gcc, but I want
>> the boostrap part of the sample to run an ARM so I have to make HOSTCC
>> my cross-compiler. Right now I'm hacking it to point to the arm64 gcc
>> however I think I'd like to add a 'cross compile mode' or something
>> whether HOSTCC points to CROSS_COMPILE instead. I'm happy to discuss
>> any ideas to get this fixed too.
>
>
> Yeah, sounds like a good idea to add such possibility. In case of
> cross compiling to a target arch with different endianess, you might
> also need to specifically select bpfeb (big endian) resp. bpfel
> (little endian) as clang target. (Just bpf target uses host endianess.)
>
> Thanks, Joel!


Re: [PATCH] samples/bpf: Fix cross compiler error with bpf sample

2017-08-04 Thread Joel Fernandes
On Fri, Aug 4, 2017 at 6:58 AM, Daniel Borkmann  wrote:
> On 08/04/2017 07:46 AM, Joel Fernandes wrote:
>>
>> When cross-compiling the bpf sample map_perf_test for aarch64, I find that
>> __NR_getpgrp is undefined. This causes build errors. Fix it by allowing
>> the
>> deprecated syscall in the sample.
>>
>> Signed-off-by: Joel Fernandes 
>> ---
>>   samples/bpf/map_perf_test_user.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/samples/bpf/map_perf_test_user.c
>> b/samples/bpf/map_perf_test_user.c
>> index 1a8894b5ac51..6e6fc7121640 100644
>> --- a/samples/bpf/map_perf_test_user.c
>> +++ b/samples/bpf/map_perf_test_user.c
>> @@ -8,7 +8,9 @@
>>   #include 
>>   #include 
>>   #include 
>> +#define __ARCH_WANT_SYSCALL_DEPRECATED
>>   #include 
>> +#undef __ARCH_WANT_SYSCALL_DEPRECATED
>
>
> So the only arch that sets __ARCH_WANT_SYSCALL_DEPRECATED
> is score:
>
>   $ git grep -n __ARCH_WANT_SYSCALL_DEPRECATED
>   arch/score/include/uapi/asm/unistd.h:7:#define
> __ARCH_WANT_SYSCALL_DEPRECATED
>   include/uapi/asm-generic/unistd.h:837:#ifdef
> __ARCH_WANT_SYSCALL_DEPRECATED
>   include/uapi/asm-generic/unistd.h:899:#endif /*
> __ARCH_WANT_SYSCALL_DEPRECATED */
>
> But even if this would make aarch64 compile, the syscall
> numbers don't match up:
>
>   $ git grep -n __NR_getpgrp include/uapi/asm-generic/unistd.h
>   include/uapi/asm-generic/unistd.h:841:#define __NR_getpgrp 1060
>   include/uapi/asm-generic/unistd.h:843:__SYSCALL(__NR_getpgrp, sys_getpgrp)
>
> The only thing that can be found on arm64 is:
>
>   $ git grep -n __NR_getpgrp arch/arm64/
>   arch/arm64/include/asm/unistd32.h:154:#define __NR_getpgrp 65
>   arch/arm64/include/asm/unistd32.h:155:__SYSCALL(__NR_getpgrp, sys_getpgrp)
>
> In arch/arm64/include/asm/unistd.h, it does include the
> uapi/asm/unistd.h when compat is not set, but without the
> __ARCH_WANT_SYSCALL_DEPRECATED. That doesn't look correct
> unless I'm missing something, hmm, can't we just attach the
> kprobes to a different syscall, one that is not deprecated,
> so that we don't run into this in the first place?

Yes, I agree that's better. I think we can use getpgid. I'll try to
whip something today and send it out.

I also wanted to fix something else, HOSTCC is set to gcc, but I want
the boostrap part of the sample to run an ARM so I have to make HOSTCC
my cross-compiler. Right now I'm hacking it to point to the arm64 gcc
however I think I'd like to add a 'cross compile mode' or something
whether HOSTCC points to CROSS_COMPILE instead. I'm happy to discuss
any ideas to get this fixed too.

thanks!

-Joel


>
> Thanks,
> Daniel


[PATCH] samples/bpf: Fix cross compiler error with bpf sample

2017-08-03 Thread Joel Fernandes
When cross-compiling the bpf sample map_perf_test for aarch64, I find that
__NR_getpgrp is undefined. This causes build errors. Fix it by allowing the
deprecated syscall in the sample.

Signed-off-by: Joel Fernandes 
---
 samples/bpf/map_perf_test_user.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/samples/bpf/map_perf_test_user.c b/samples/bpf/map_perf_test_user.c
index 1a8894b5ac51..6e6fc7121640 100644
--- a/samples/bpf/map_perf_test_user.c
+++ b/samples/bpf/map_perf_test_user.c
@@ -8,7 +8,9 @@
 #include 
 #include 
 #include 
+#define __ARCH_WANT_SYSCALL_DEPRECATED
 #include 
+#undef __ARCH_WANT_SYSCALL_DEPRECATED
 #include 
 #include 
 #include 
-- 
2.14.0.rc1.383.gd1ce394fe2-goog