Re: wireguard: unknown relocation: 102 [ARMv7 Thumb-2]

2020-06-19 Thread Rui Salvaterra
Good morning, Jason!

On Fri, 19 Jun 2020 at 00:50, Jason A. Donenfeld wrote:
>
> Hey Rui,
>
> I fixed it! It turned out to be caused by -fvisibility=hidden undoing
> the effect of the binutils fix from a while back. Here's the patch
> that makes the problem go away:
>
> https://git.zx2c4.com/wireguard-linux-compat/commit/?id=178cdfffb99f2fd6fb4a5bfd2f9319461d93f53b
>
> This will be in the next compat release.
>
> Jason

Great detective work there too! :) I do have to wonder whether this is
really a binutils/gas bug, though. From what I could gather, it's only
the kernel module loader that can't resolve R_ARM_THM_JUMP11
relocations (and won't: I remember reading somewhere that they don't
make sense for the kernel), and using -fvisibility=hidden in a
kernel module seems to send the linker a conflicting message. Anyway,
I'd still open that bug, at least to get a definitive answer. ;)

Thanks a lot,
Rui
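
For readers wondering where the "unknown relocation: 102" message in the
subject comes from, here is a minimal sketch of a loader-style relocation
check. The type numbers are the ones assigned by the ARM ELF ABI (102 is
R_ARM_THM_JUMP11). The helper names reloc_type() and check_reloc(), and
the choice of which types are accepted, are assumptions made for
illustration and are not the kernel's actual code.

```c
#include <stdint.h>
#include <stdio.h>

/* Relocation type numbers from the ARM ELF ABI. */
#define R_ARM_THM_CALL    10	/* Thumb BL/BLX */
#define R_ARM_THM_JUMP24  30	/* Thumb-2 32-bit B.W */
#define R_ARM_THM_JUMP11 102	/* Thumb 16-bit B */

/* The low 8 bits of an Elf32_Rel r_info field hold the relocation type. */
static unsigned int reloc_type(uint32_t r_info)
{
	return r_info & 0xff;
}

/*
 * Hypothetical stand-in for a loader's relocation switch: returns 0 for
 * the types it is assumed to know how to patch, and logs and returns -1
 * for everything else.
 */
static int check_reloc(const char *module, uint32_t r_info)
{
	switch (reloc_type(r_info)) {
	case R_ARM_THM_CALL:
	case R_ARM_THM_JUMP24:
		return 0;
	default:
		fprintf(stderr, "%s: unknown relocation: %u\n", module,
			reloc_type(r_info));
		return -1;
	}
}
```

Under these assumptions, an entry whose type byte is 102 falls through to
the default case, which is the shape of the failure reported in this
thread.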


Re: wireguard: unknown relocation: 102 [ARMv7 Thumb-2]

2020-06-18 Thread Jason A. Donenfeld
Hey Rui,

I fixed it! It turned out to be caused by -fvisibility=hidden undoing
the effect of the binutils fix from a while back. Here's the patch
that makes the problem go away:

https://git.zx2c4.com/wireguard-linux-compat/commit/?id=178cdfffb99f2fd6fb4a5bfd2f9319461d93f53b

This will be in the next compat release.

Jason


Re: wireguard: unknown relocation: 102 [ARMv7 Thumb-2]

2020-06-17 Thread Jason A. Donenfeld
On Wed, Jun 17, 2020 at 02:33:49PM -0600, Jason A. Donenfeld wrote:
> So, some more research: it looks like the R_ARM_THM_JUMP11 symbol is
> actually wg_packet_send_staged_packets, a boring C function with
> nothing fancy about it. That github issue you pointed to suggested
> that it might have something to do with complex crypto functions, but
> it looks like that's not the case. wg_packet_send_staged_packets is
> plain old boring C.
> 
> But there is one interesting thing about
> wg_packet_send_staged_packets: it's defined in send.c, and called from
> send.c, receive.c, device.c, and netlink.c -- four places. What I
> suspect is happening is that the linker can't quite figure out how to
> order the functions in the final executable so that the
> wg_packet_send_staged_packets definition is sufficiently close to all
> of its call sites, so it then needs to add that extra trampoline
> midway to get to it. Stupid linker. I'm checking now whether there's
> some manual reordering I can do in the build system so that this isn't
> a problem, but I'm not very optimistic that I'll succeed.

Looks like my explanation there wasn't 100% accurate, but it does seem
like the issue occurs when gcc sees a clear tail call that it can
optimize into a B instruction instead of a BL instruction.

The below patch avoids that, and thus fixes your issue, using a pretty
bad trick that's not really suitable for being committed anywhere, but
it is perhaps leading us in the right direction:

diff --git a/src/send.c b/src/send.c
index 828b086a..4bb6911f 100644
--- a/src/send.c
+++ b/src/send.c
@@ -221,6 +221,8 @@ static bool encrypt_packet(struct sk_buff *skb, struct noise_keypair *keypair,
     simd_context);
 }
 
+volatile char dummy;
+
 void wg_packet_send_keepalive(struct wg_peer *peer)
 {
  struct sk_buff *skb;
@@ -240,6 +242,7 @@ void wg_packet_send_keepalive(struct wg_peer *peer)
  }
 
  wg_packet_send_staged_packets(peer);
+ dummy = -1;
 }
 
 static void wg_packet_create_data_done(struct sk_buff *first,
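
The trick in the patch above can be reproduced in isolation. The sketch
below assumes GCC or Clang (for the noinline attribute); send_staged(),
keepalive_tail() and keepalive_no_tail() are made-up stand-ins and are not
WireGuard functions. It only shows why a volatile store after the call
keeps the call from being in tail position.

```c
/* Counts work done so the behaviour can be checked from a test. */
static int sent;

/* Stand-in for the callee; noinline keeps it a real call. */
__attribute__((noinline)) void send_staged(int peer)
{
	sent += peer;
}

/*
 * The call is the last thing this function does, so an optimizing
 * compiler is free to emit a plain branch (B) instead of a
 * branch-and-link (BL).
 */
void keepalive_tail(int peer)
{
	send_staged(peer);
}

/* A volatile store cannot be dropped or moved ahead of the call. */
volatile char dummy;

void keepalive_no_tail(int peer)
{
	send_staged(peer);
	dummy = -1;	/* work remains after the call: no longer a tail call */
}
```

Both functions behave identically from the caller's point of view. The
difference should only show up in the generated assembly at -O2, and GCC's
-fno-optimize-sibling-calls option is another way to compare the two
forms.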


Re: wireguard: unknown relocation: 102 [ARMv7 Thumb-2]

2020-06-17 Thread Jason A. Donenfeld
On Wed, Jun 17, 2020 at 02:45:12PM -0600, Jason A. Donenfeld wrote:
> Looks like my explanation there wasn't 100% accurate, but it does seem
> like the issue occurs when gcc sees a clear tail call that it can
> optimize into a B instruction instead of a BL instruction.
> 
> The below patch avoids that, and thus fixes your issue, using a pretty
> bad trick that's not really suitable for being committed anywhere, but
> it is perhaps leading us in the right direction:
> 
> diff --git a/src/send.c b/src/send.c
> index 828b086a..4bb6911f 100644
> --- a/src/send.c
> +++ b/src/send.c
> @@ -221,6 +221,8 @@ static bool encrypt_packet(struct sk_buff *skb, struct noise_keypair *keypair,
>      simd_context);
>  }
>  
> +volatile char dummy;
> +
>  void wg_packet_send_keepalive(struct wg_peer *peer)
>  {
>   struct sk_buff *skb;
> @@ -240,6 +242,7 @@ void wg_packet_send_keepalive(struct wg_peer *peer)
>   }
>  
>   wg_packet_send_staged_packets(peer);
> + dummy = -1;
>  }
>  
>  static void wg_packet_create_data_done(struct sk_buff *first,

A better fix with more explanation: it looks like the issue doesn't have
to do with the multifile thing I pointed out before, but just that gcc
sees it can optimize the tail call into a B instruction, which seems to
have a ±2KB range, whereas BL has a ±4MB range (±16MB in its Thumb-2
encoding). The solution is to just
move the location of the function in that file to be closer to the
destination of the tail call. I'm not a big fan of that and I'm slightly
worried davem will nack it because it makes backporting harder for a
fairly speculative gain (at least, I haven't yet taken measurements,
though I suppose I could). There's also the question: why are we
doing goofy reordering things to the code to work around a toolchain
bug? Shouldn't we fix the toolchain? So, I'll keep thinking...
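
The range arithmetic above can be checked with a small sketch. It assumes
the usual Thumb encodings: the 16-bit B carries an 11-bit immediate, the
original Thumb BL a 22-bit one, and the Thumb-2 BL a 24-bit one, each
counted in halfwords. The helper thumb_branch_reaches() and the macro
names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define THUMB_B16_BITS  11	/* 16-bit B: imm11 */
#define THUMB1_BL_BITS  22	/* original Thumb BL: two 11-bit halves */
#define THUMB2_BL_BITS  24	/* Thumb-2 BL: S:I1:I2:imm10:imm11 */

/*
 * A Thumb branch stores a signed offset in halfwords, relative to the
 * PC. With `bits` bits of immediate (sign bit included), the reachable
 * byte offsets are the even values in [-(2^bits), 2^bits - 2].
 */
static bool thumb_branch_reaches(int64_t byte_offset, unsigned int bits)
{
	int64_t limit = (int64_t)1 << bits;

	if (byte_offset & 1)	/* targets are halfword aligned */
		return false;
	return byte_offset >= -limit && byte_offset <= limit - 2;
}
```

With 11 bits the limit is 2048 bytes, which matches the ±2KB figure, and
22 bits gives 4194304 bytes, i.e. ±4MB. A callee placed more than 2KB away
from a B-style tail call therefore cannot be reached by
R_ARM_THM_JUMP11, which is why moving the function closer helps.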

diff --git a/src/send.c b/src/send.c
index 828b086a..f44aff8d 100644
--- a/src/send.c
+++ b/src/send.c
@@ -221,27 +221,6 @@ static bool encrypt_packet(struct sk_buff *skb, struct noise_keypair *keypair,
   simd_context);
 }

-void wg_packet_send_keepalive(struct wg_peer *peer)
-{
-	struct sk_buff *skb;
-
-	if (skb_queue_empty(&peer->staged_packet_queue)) {
-		skb = alloc_skb(DATA_PACKET_HEAD_ROOM + MESSAGE_MINIMUM_LENGTH,
-				GFP_ATOMIC);
-		if (unlikely(!skb))
-			return;
-		skb_reserve(skb, DATA_PACKET_HEAD_ROOM);
-		skb->dev = peer->device->dev;
-		PACKET_CB(skb)->mtu = skb->dev->mtu;
-		skb_queue_tail(&peer->staged_packet_queue, skb);
-		net_dbg_ratelimited("%s: Sending keepalive packet to peer %llu (%pISpfsc)\n",
-				    peer->device->dev->name, peer->internal_id,
-				    &peer->endpoint.addr);
-	}
-
-	wg_packet_send_staged_packets(peer);
-}
-
-
 static void wg_packet_create_data_done(struct sk_buff *first,
 				       struct wg_peer *peer)
 {
@@ -346,6 +325,27 @@ err:
 	kfree_skb_list(first);
 }

+void wg_packet_send_keepalive(struct wg_peer *peer)
+{
+	struct sk_buff *skb;
+
+	if (skb_queue_empty(&peer->staged_packet_queue)) {
+		skb = alloc_skb(DATA_PACKET_HEAD_ROOM + MESSAGE_MINIMUM_LENGTH,
+				GFP_ATOMIC);
+		if (unlikely(!skb))
+			return;
+		skb_reserve(skb, DATA_PACKET_HEAD_ROOM);
+		skb->dev = peer->device->dev;
+		PACKET_CB(skb)->mtu = skb->dev->mtu;
+		skb_queue_tail(&peer->staged_packet_queue, skb);
+		net_dbg_ratelimited("%s: Sending keepalive packet to peer %llu (%pISpfsc)\n",
+				    peer->device->dev->name, peer->internal_id,
+				    &peer->endpoint.addr);
+	}
+
+	wg_packet_send_staged_packets(peer);
+}
+
+
 void wg_packet_purge_staged_packets(struct wg_peer *peer)
 {
 	spin_lock_bh(&peer->staged_packet_queue.lock);