Re: [OpenWrt-Devel] wireguard: unknown relocation: 102 [ARMv7 Thumb-2]
Good morning, Jason! On Fri, 19 Jun 2020 at 00:50, Jason A. Donenfeld wrote: > > Hey Rui, > > I fixed it! It turned out to be caused by -fvisibility=hidden undoing > the effect of the binutils fix from a while back. Here's the patch > that makes the problem go away: > > https://git.zx2c4.com/wireguard-linux-compat/commit/?id=178cdfffb99f2fd6fb4a5bfd2f9319461d93f53b > > This will be in the next compat release. > > Jason Great detective work there too! :) I do have to wonder if this is really a binutils/gas bug, though. From what I could gather, it's only the kernel module loader which can't (and won't, I remember reading somewhere they don't make sense for the kernel) resolve R_ARM_THM_JUMP11 relocations, and using -fvisibility=hidden in a kernel module seems to send the linker a conflicting message. Anyway, I'd still open that bug, at least to get a definitive answer. ;) Thanks a lot, Rui ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Re: [OpenWrt-Devel] wireguard: unknown relocation: 102 [ARMv7 Thumb-2]
Hey Rui, I fixed it! It turned out to be caused by -fvisibility=hidden undoing the effect of the binutils fix from a while back. Here's the patch that makes the problem go away: https://git.zx2c4.com/wireguard-linux-compat/commit/?id=178cdfffb99f2fd6fb4a5bfd2f9319461d93f53b This will be in the next compat release. Jason ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Re: [OpenWrt-Devel] wireguard: unknown relocation: 102 [ARMv7 Thumb-2]
On Wed, Jun 17, 2020 at 02:45:12PM -0600, Jason A. Donenfeld wrote: > Looks like my explanation there wasn't 100% accurate, but it does seem > like the issue occurs when gcc sees a clear tail call that it can > optimize into a B instruction instead of a BL instruction. > > The below patch avoids that, and thus fixes your issue, using a pretty > bad trick that's not really suitable for being committed anywhere, but > it is perhaps leading us in the right direction: > > diff --git a/src/send.c b/src/send.c > index 828b086a..4bb6911f 100644 > --- a/src/send.c > +++ b/src/send.c > @@ -221,6 +221,8 @@ static bool encrypt_packet(struct sk_buff *skb, struct > noise_keypair *keypair, > simd_context); > } > > +volatile char dummy; > + > void wg_packet_send_keepalive(struct wg_peer *peer) > { > struct sk_buff *skb; > @@ -240,6 +242,7 @@ void wg_packet_send_keepalive(struct wg_peer *peer) > } > > wg_packet_send_staged_packets(peer); > + dummy = -1; > } > > static void wg_packet_create_data_done(struct sk_buff *first, A better fix with more explanation: it looks like the issue doesn't have to do with the multifile thing I pointed out before, but just that gcc sees it can optimize the tail call into a B instruction, which seems to have a ±2KB range, whereas BL has a ±4MB range. The solution is to just move the location of the function in that file to be closer to the destination of the tail call. I'm not a big fan of that and I'm slightly worried davem will nack it because it makes backporting harder for a fairly speculative gain (at least, I haven't yet taken measurements, though I suppose I could). There's also the question of - why are we doing goofy reordering things to the code to work around a toolchain bug? Shouldn't we fix the toolchain? So, I'll keep thinking... diff --git a/src/send.c b/src/send.c index 828b086a..f44aff8d 100644 --- a/src/send.c +++ b/src/send.c @@ -221,27 +221,6 @@ static bool encrypt_packet(struct sk_buff *skb, struct noise_keypair *keypair, simd_context); } -void wg_packet_send_keepalive(struct wg_peer *peer) -{ - struct sk_buff *skb; - - if (skb_queue_empty(&peer->staged_packet_queue)) { - skb = alloc_skb(DATA_PACKET_HEAD_ROOM + MESSAGE_MINIMUM_LENGTH, - GFP_ATOMIC); - if (unlikely(!skb)) - return; - skb_reserve(skb, DATA_PACKET_HEAD_ROOM); - skb->dev = peer->device->dev; - PACKET_CB(skb)->mtu = skb->dev->mtu; - skb_queue_tail(&peer->staged_packet_queue, skb); - net_dbg_ratelimited("%s: Sending keepalive packet to peer %llu (%pISpfsc)\n", - peer->device->dev->name, peer->internal_id, - &peer->endpoint.addr); - } - - wg_packet_send_staged_packets(peer); -} - static void wg_packet_create_data_done(struct sk_buff *first, struct wg_peer *peer) { @@ -346,6 +325,27 @@ err: kfree_skb_list(first); } +void wg_packet_send_keepalive(struct wg_peer *peer) +{ + struct sk_buff *skb; + + if (skb_queue_empty(&peer->staged_packet_queue)) { + skb = alloc_skb(DATA_PACKET_HEAD_ROOM + MESSAGE_MINIMUM_LENGTH, + GFP_ATOMIC); + if (unlikely(!skb)) + return; + skb_reserve(skb, DATA_PACKET_HEAD_ROOM); + skb->dev = peer->device->dev; + PACKET_CB(skb)->mtu = skb->dev->mtu; + skb_queue_tail(&peer->staged_packet_queue, skb); + net_dbg_ratelimited("%s: Sending keepalive packet to peer %llu (%pISpfsc)\n", + peer->device->dev->name, peer->internal_id, + &peer->endpoint.addr); + } + + wg_packet_send_staged_packets(peer); +} + void wg_packet_purge_staged_packets(struct wg_peer *peer) { spin_lock_bh(&peer->staged_packet_queue.lock); ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Re: [OpenWrt-Devel] wireguard: unknown relocation: 102 [ARMv7 Thumb-2]
On Wed, Jun 17, 2020 at 02:33:49PM -0600, Jason A. Donenfeld wrote: > So, some more research: it looks like the R_ARM_THM_JUMP11 symbol is > actually wg_packet_send_staged_packets, a boring C function with > nothing fancy about it. That github issue you pointed to suggested > that it might have something to do with complex crypto functions, but > it looks like that's not the case. wg_packet_send_staged_packets is > plain old boring C. > > But there is one interesting thing about > wg_packet_send_staged_packets: it's defined in send.c, and called from > send.c, receive.c, device.c, and netlink.c -- four places. What I > suspect is happening is that the linker can't quite figure out how to > order the functions in the final executable so that the > wg_packet_send_staged_packets definition is sufficiently close to all > of its call sites, so it then needs to add that extra trampoline > midway to get to it. Stupid linker. I'm playing now if there's some > manual reordering I can do in the build system so that this isn't a > problem, but I'm not very optimistic that I'll succeed. Looks like my explanation there wasn't 100% accurate, but it does seem like the issue occurs when gcc sees a clear tail call that it can optimize into a B instruction instead of a BL instruction. The below patch avoids that, and thus fixes your issue, using a pretty bad trick that's not really suitable for being committed anywhere, but it is perhaps leading us in the right direction: diff --git a/src/send.c b/src/send.c index 828b086a..4bb6911f 100644 --- a/src/send.c +++ b/src/send.c @@ -221,6 +221,8 @@ static bool encrypt_packet(struct sk_buff *skb, struct noise_keypair *keypair, simd_context); } +volatile char dummy; + void wg_packet_send_keepalive(struct wg_peer *peer) { struct sk_buff *skb; @@ -240,6 +242,7 @@ void wg_packet_send_keepalive(struct wg_peer *peer) } wg_packet_send_staged_packets(peer); + dummy = -1; } static void wg_packet_create_data_done(struct sk_buff *first, ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Re: [OpenWrt-Devel] wireguard: unknown relocation: 102 [ARMv7 Thumb-2]
So, some more research: it looks like the R_ARM_THM_JUMP11 symbol is actually wg_packet_send_staged_packets, a boring C function with nothing fancy about it. That github issue you pointed to suggested that it might have something to do with complex crypto functions, but it looks like that's not the case. wg_packet_send_staged_packets is plain old boring C. But there is one interesting thing about wg_packet_send_staged_packets: it's defined in send.c, and called from send.c, receive.c, device.c, and netlink.c -- four places. What I suspect is happening is that the linker can't quite figure out how to order the functions in the final executable so that the wg_packet_send_staged_packets definition is sufficiently close to all of its call sites, so it then needs to add that extra trampoline midway to get to it. Stupid linker. I'm playing now if there's some manual reordering I can do in the build system so that this isn't a problem, but I'm not very optimistic that I'll succeed. ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Re: [OpenWrt-Devel] wireguard: unknown relocation: 102 [ARMv7 Thumb-2]
Hi Rui, On Wed, Jun 17, 2020 at 7:19 AM Rui Salvaterra wrote: > After a bit more digging [1], I believe I've narrowed it down. > CONFIG_THUMB2_AVOID_R_ARM_THM_JUMP11=y is required in order to avoid > the emission of R_ARM_THM_JUMP11 relocations in the WireGuard module. > I'm now wondering why the compat modules haven't exhibited the same > problem (maybe it was just a fluke), but since this kconfig option > effectively implies -fno-optimize-sibling-calls [2], it's quite a > hefty hammer. Is this something that can be solved in the WireGuard > build itself? > > Thanks in advance, > Rui > > [1] https://github.com/openwrt/openwrt/pull/3079#issuecomment-645297337 > [2] > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm/Makefile?h=linux-5.4.y#n125 Ahh hah, nice detective work. Reading the Kconfig description, it looks like this is actually a toolchain bug with modules in general: config THUMB2_AVOID_R_ARM_THM_JUMP11 bool "Work around buggy Thumb-2 short branch relocations in gas" depends on THUMB2_KERNEL && MODULES default y help Various binutils versions can resolve Thumb-2 branches to locally-defined, preemptible global symbols as short-range "b.n" branch instructions. This is a problem, because there's no guarantee the final destination of the symbol, or any candidate locations for a trampoline, are within range of the branch. For this reason, the kernel does not support fixing up the R_ARM_THM_JUMP11 (102) relocation in modules at all, and it makes little sense to add support. The symptom is that the kernel fails with an "unsupported relocation" error when loading some modules. Until fixed tools are available, passing -fno-optimize-sibling-calls to gcc should prevent gcc generating code which hits this problem, at the cost of a bit of extra runtime stack usage in some cases. The problem is described in more detail at: https://bugs.launchpad.net/binutils-linaro/+bug/725126 Only Thumb-2 kernels are affected. Unless you are sure your tools don't have this problem, say Y. ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Re: [OpenWrt-Devel] wireguard: unknown relocation: 102 [ARMv7 Thumb-2]
Hi again, Jason, On Wed, 10 Jun 2020 at 11:09, Jason A. Donenfeld wrote: > > Eventually I can probably get this building and testing and find some > hardware for this and such. But if you'd like things to move faster, > trying to reproduce the issue in the qemu test suite will result in a > quicker fix. After a bit more digging [1], I believe I've narrowed it down. CONFIG_THUMB2_AVOID_R_ARM_THM_JUMP11=y is required in order to avoid the emission of R_ARM_THM_JUMP11 relocations in the WireGuard module. I'm now wondering why the compat modules haven't exhibited the same problem (maybe it was just a fluke), but since this kconfig option effectively implies -fno-optimize-sibling-calls [2], it's quite a hefty hammer. Is this something that can be solved in the WireGuard build itself? Thanks in advance, Rui [1] https://github.com/openwrt/openwrt/pull/3079#issuecomment-645297337 [2] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm/Makefile?h=linux-5.4.y#n125 ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Re: [OpenWrt-Devel] wireguard: unknown relocation: 102 [ARMv7 Thumb-2]
On Wed, Jun 10, 2020 at 4:05 AM Rui Salvaterra wrote: > > Hi, Jason, > > On Wed, 10 Jun 2020 at 10:31, Rui Salvaterra wrote: > > > > Good question. :) You're testing in QEMU (which I personally never > > used), right? I don't know how familiar you are with OpenWrt, but I > > can surely send you my configuration (it's spread across multiple > > files, though). > > Ok, so this is what I do (on a pristine tree, after cloning the > buildroot and the packages feed): > > First, I change the CPU subtype to neon (sadly, the Armada 385 is > castrated upstream since the 370 only supports VFPv3-D16 :/). > > diff --git a/target/linux/mvebu/cortexa9/target.mk > b/target/linux/mvebu/cortexa9/target.mk > index cdd4d86e49..9af3c95d7b 100644 > --- a/target/linux/mvebu/cortexa9/target.mk > +++ b/target/linux/mvebu/cortexa9/target.mk > @@ -10,5 +10,5 @@ include $(TOPDIR)/rules.mk > ARCH:=arm > BOARDNAME:=Marvell Armada 37x/38x/XP > CPU_TYPE:=cortex-a9 > -CPU_SUBTYPE:=vfpv3-d16 > +CPU_SUBTYPE:=neon > KERNELNAME:=zImage dtbs > > Then, I use the attached configuration files. The .config (for > OpenWrt) in the buildroot, and config-default (for the kernel itself) > in target/linux/mvebu/cortexa9/. > > Let me know if you need anything else! Eventually I can probably get this building and testing and find some hardware for this and such. But if you'd like things to move faster, trying to reproduce the issue in the qemu test suite will result in a quicker fix. Jason ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Re: [OpenWrt-Devel] wireguard: unknown relocation: 102 [ARMv7 Thumb-2]
Hi, Jason, Thanks for the quick reply! On Wed, 10 Jun 2020 at 10:20, Jason A. Donenfeld wrote: > > Is there some config combination you can stick into the test harness > to repro what you're seeing? Good question. :) You're testing in QEMU (which I personally never used), right? I don't know how familiar you are with OpenWrt, but I can surely send you my configuration (it's spread across multiple files, though). Thanks, Rui ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Re: [OpenWrt-Devel] wireguard: unknown relocation: 102 [ARMv7 Thumb-2]
Hi Rui, I'm unable to reproduce this: $ git clone https://git.zx2c4.com/wireguard-linux-compat $ ARCH=arm make -C wireguard-linux-compat/src test-qemu -j$(nproc) [... big test suite ...] $ vim wireguard-linux-compat/qemu-build/arm/linux-5.5.14/.config [... enable CONFIG_THUMB2_KERNEL=y ...] $ ARCH=arm make -C wireguard-linux-compat/src test-qemu -j$(nproc) [... big test suite ...] Is there some config combination you can stick into the test harness to repro what you're seeing? Jason ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel
[OpenWrt-Devel] wireguard: unknown relocation: 102 [ARMv7 Thumb-2]
Hi, Jason, I'm trying to build OpenWrt master with Thumb-2 instructions for my Turris Omnia (both kernel and userspace) with GCC 9.3.0 and Binutils 2.34 from the toolchain. [1] Everything seems to work apart from WireGuard, for some reason the module won't load, throwing the relocation error in $subject (other backported compat modules load just fine). Do you have any idea about the possible cause? This is mostly a heads-up, since I'm surely treading officially unsupported grounds. ;) Thanks, Rui [1] Interestingly enough, the final image is bigger (maybe the Thumb-2 encoding is less redundant and doesn't compress as well as ARM?). ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel