Re: Broadcom BCM2709, ARMv8, and missing CPU features
On Sat, Aug 06, 2016 at 04:25:13PM +0100, Luke Kenneth Casson Leighton wrote:
> did i hear right that there's also a core design difference between
> the A7 and the A53 which results in a performance/watt loss of around
> 15%? so you're actually *worse off* going to 64-bit at the moment, if
> power (battery life) really matters. i think it was on anandtech or
> something.

Well, the 64-bit equivalent of the A7 would be the A35. The A53 is a step up, more like an A9 I would think, while the A57 is at an A15 type of performance level and the A72 probably maps to about the A17, or maybe even better.

Certainly in terms of MIPS/MHz, the A53 is between the A8 and A9, while the A57 is at A15/A17 level, and the A72 is quite a bit faster than the rest. Apparently there is an A73 coming that adds another 30% performance over the A72.

The A35 is supposed to deliver 6 to 40% better performance than the A7 at the same power.

--
Len Sorensen
Re: Broadcom BCM2709, ARMv8, and missing CPU features
On Tue, Jul 26, 2016 at 11:28 PM, Jeffrey Walton wrote:
> Hi Everyone,
>
> I recently purchased a Raspberry Pi 3. Its got a Broadcom SoC, and its
> ARMv8. Its running a Debian-lite kernel, which I believe is a modified
> 4.4 kernel.
>
> Below is the output from cpuinfo. I see ARMv8's crc32 is available,
> but I don't see pmull, aes or sha. At the moment, I'm not sure if its
> truly missing, or the execution environment is not quite correct.
>
> My question is, what's going on with the device? Is the hardware truly
> lacking the features, or is the image lagging behind capabilities?

In case anyone is interested... The Raspberry Pi 3 (Broadcom SoC) and the ODROID C2 (Amlogic SoC) include CRC32, but lack the Crypto extensions. The HiKey and Pine64 have both the CRC32 and the Crypto extensions.

Jeff
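For anyone who wants to script the check described above, here is a minimal sketch in C that looks for a flag as a whole word in the "Features" line of /proc/cpuinfo. The function name `has_cpu_flag` is made up for this example, and the sample string used to exercise it is illustrative only, not output copied from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if `flag` appears as a whole word in `features`,
   e.g. the text of the "Features" line from /proc/cpuinfo.
   has_cpu_flag is a hypothetical helper name, not a kernel or libc API. */
static bool has_cpu_flag(const char *features, const char *flag)
{
    assert(features != NULL && flag != NULL);

    size_t n = strlen(flag);
    if (n == 0)
        return false;

    const char *p = features;
    while ((p = strstr(p, flag)) != NULL) {
        /* A match only counts if it is delimited on both sides. */
        bool starts = (p == features) || p[-1] == ' ' || p[-1] == '\t' || p[-1] == ':';
        bool ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';
        if (starts && ends)
            return true;
        p += 1; /* keep scanning past a partial match such as "vfpv3" for "vfp" */
    }
    return false;
}
```

Called with a line that lists crc32 but no aes, pmull, sha1 or sha2, it reports exactly the situation seen on the Pi 3 and ODROID C2. Matching whole words matters because flags such as "vfp" are prefixes of other flags.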
Re: Broadcom BCM2709, ARMv8, and missing CPU features
> the reason i ask that is, i'm not seeing any real difference: you > still have to download the linux kernel source (to submit dtsi > patches), the linux git repo is still the central location for dtsi > management... unless you're happy to set up an alternative parallel > repository (and compile infrastructure) for dtsi management... thus > you still have to download the full git repo, you still have to > compile stuff *from* that same git repo where's the actual benefit > to having moved to dtsi, in terms of "work needed to maintain it"? I don't do much kernel hacking, myself. So I'm talking about use of apt-get. The difference is very significant because Debian never maintained enough different kernels. Stefan
Re: Broadcom BCM2709, ARMv8, and missing CPU features
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Sat, Aug 6, 2016 at 8:15 PM, Stefan Monnier wrote:
>> the only big advantage of dtb files (binary compiled) is *IF* the
>> decision is made to respect dtb files and treat them as inviolate
>> and supported forever without needing recompiles, you stand a
>> chance of being able to upgrade linux kernels *without* replacing
>> the dtb file.
>
> That might be true when compared to some potential replacement of DTBs,
> but when compared to what we had before DTBs, then the benefit is much
> more clear: a single linux-image-armhf package which works for "all"
> machines. Personally I don't mind changing the DTB every time I change
> the kernel. Hell, that could/should be integrated with the process
> which refreshes the initrd file anyway.

... are you _sure_ it's clear? :)

the reason i ask that is, i'm not seeing any real difference: you still have to download the linux kernel source (to submit dtsi patches), the linux git repo is still the central location for dtsi management... unless you're happy to set up an alternative parallel repository (and compile infrastructure) for dtsi management... thus you still have to download the full git repo, you still have to compile stuff *from* that same git repo.

where's the actual benefit to having moved to dtsi, in terms of "work needed to maintain it"?

i appreciate you don't *mind* changing the DTB file each time you change the kernel, but that defeats one of the very purposes *of* the DTB file.

also, i don't know if you've looked in arch/arm/boot/dts but it's already alarmingly full. i appreciate that there's some includes (dtsi) but realistically over time the sharing process is going to begin to look like the selinux m4 macro includes or the openembedded infrastructure: an unintelligible and unmaintainable dog's dinner that only a handful of people in the world can understand.

anyway to get back to the original topic, there's very little that can actually be shared - even with devicetree - between different devices. it's the "N product design types" times "M processors" thing. which is why i'm designing a hardware standard that's similar to how things are in the x86 world, so that we can get back to "N PLUS M" at the linux kernel level.

l.
Re: Broadcom BCM2709, ARMv8, and missing CPU features
> the only big advantage of dtb files (binary compiled) is *IF* the > decision is made to respect dtb files and treat them as inviolate > and supported forever without needing recompiles, you stand a > chance of being able to upgrade linux kernels *without* replacing > the dtb file. That might be true when compared to some potential replacement of DTBs, but when compared to what we had before DTBs, then the benefit is much more clear: a single linux-image-armhf package which works for "all" machines. Personally I don't mind changing the DTB every time I change the kernel. Hell, that could/should be integrated with the process which refreshes the initrd file anyway. Stefan
Re: Broadcom BCM2709, ARMv8, and missing CPU features
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Sat, Aug 6, 2016 at 2:57 PM, Stefan Monnier wrote:
> Note also that you will sometimes *lose* performance by going to 64bit
> because the pointers use up twice as much space, so if your program
> needs to store many pointers, it will use up more cache space
> and memory bandwidth, which will tend to slow it down.

did i hear right that there's also a core design difference between the A7 and the A53 which results in a performance/watt loss of around 15%? so you're actually *worse off* going to 64-bit at the moment, if power (battery life) really matters. i think it was on anandtech or something.

l.
Re: Broadcom BCM2709, ARMv8, and missing CPU features
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Thu, Jul 28, 2016 at 5:35 PM, Gunnar Wolf wrote:
> Keep in mind it's not different Debian images we are talking about:
> "real" Debian cannot be booted on Raspberry hardware. I run a Debian
> userland on top of their provided kernel (with the mystery blobs to
> control its hardware), started by their mystery bootloader. And yes,
> for us people coming from the x86 world, we expect similar devices to
> "just work", but in ARM it *is* really a different way of doing things
> per each kind of board.

i did find it very funny to learn that Linus did not understand why there were so many ARM developers at the Cambridge Linux Conference back in... when was it... 2007? it coincided with UKUUG at the time. he's famously on record as saying "why are there so many of you? go away, choose one representative and come back with just one person!"

likewise, i _am_ on record as pointing out a long long time ago that device-tree will not stop the proliferation or complexity of developing device drivers for ARM: it merely *moves* the proliferation and complexity... into dtsi files.

the only big advantage of dtb files (binary compiled) is *IF* the decision is made to respect dtb files and treat them as inviolate and supported forever without needing recompiles, you stand a chance of being able to upgrade linux kernels *without* replacing the dtb file.

however i seriously doubt that the stringent testing needed to make that work will ever be put in place. oh well.

l.
Re: Broadcom BCM2709, ARMv8, and missing CPU features
> physically). What you often gain going to a 64 bit CPU is the ability
> to do 64 bit arithmetic in one instruction, and store the variables
[...]
> 32 bit calculations, then it doesn't matter, so in many cases it isn't an
> issue, but when it matters it can really make a difference in performance.

AFAIK the difference is only visible for operations on *integers* of 64 bits (or more). Some programs make significant use of such operations, but in general they're not that common. So I'd be surprised if "you often gain".

Note also that you will sometimes *lose* performance by going to 64bit because the pointers use up twice as much space, so if your program needs to store many pointers, it will use up more cache space and memory bandwidth, which will tend to slow it down.

IOW unless you know your workload very well, the best prediction I could make is "you won't notice any difference".

In the x86 world, moving from i686 to amd64 has the additional advantage that the amd64 mode has more registers, which is useful in many more cases than just the manipulation of large integers. And yet, even there I find it hard to notice any difference.

Stefan
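To put a rough number on the pointer-size point, here is a small C sketch. The struct `tree_node` and the two helpers are invented for illustration. On a typical ILP32 ABI such as armhf the node should come to 12 bytes, and on a typical LP64 ABI such as arm64 to 24 bytes (two 8-byte pointers, a 4-byte key, and 4 bytes of padding), but those figures are assumptions about the ABI, not measurements from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A hypothetical pointer-heavy data structure: a binary tree node
   holding a 32-bit key. The payload is the same size on 32-bit and
   64-bit targets; only the pointers grow. */
struct tree_node {
    struct tree_node *left;
    struct tree_node *right;
    int32_t key;
};

/* Sanity check that holds on any ABI: the node is at least as large
   as its two pointers plus the key. */
static_assert(sizeof(struct tree_node) >= 2 * sizeof(void *) + sizeof(int32_t),
              "node must hold two pointers and a key");

/* Total size of one node, including any padding the ABI adds. */
static size_t node_size(void)
{
    return sizeof(struct tree_node);
}

/* Bytes of each node spent on pointers rather than payload. */
static size_t pointer_bytes_per_node(void)
{
    return 2 * sizeof(struct tree_node *);
}
```

Under those ABI assumptions the same tree takes twice the memory on arm64, so half as many nodes fit in a cache of a given size.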
Re: Broadcom BCM2709, ARMv8, and missing CPU features
On 28/07/16 17:35, Gunnar Wolf wrote:
> I'm far from an absolute expert in this area... But I am fairly
> certain of what I say. That is, I have a RPi 1 and 2B, and they
> cannot boot from the same images.

That depends on what is in the image.

The current Raspberry Pi firmware works on all Pi models (older firmware will only work with older Pi models).

The Pi1 needs a specific kernel. The Pi2 and the Pi3 can run the same 32-bit kernel (at least with foundation kernels, I dunno what the situation is with upstream kernels).

The firmware by default selects a suitable kernel (kernel.img or kernel7.img) and device tree (each Pi model has a different one, though IIRC they are pretty similar) based on the detected hardware.

There is some experimental 64-bit kernel/bootloader stuff out there for the Raspberry Pi 3:

https://www.raspberrypi.org/forums/viewtopic.php?f=72=137963=aarch64
https://www.raspberrypi.org/forums/viewtopic.php?f=72=143765

Userland obviously has to be compatible with the hardware and kernel. Raspbian or Debian armel userlands should be usable on any Pi model. Debian armhf userland should be usable on a Pi2 or Pi3 but clearly not a Pi1. Debian arm64 will obviously require a 64-bit kernel and a Pi3.
Re: Broadcom BCM2709, ARMv8, and missing CPU features
On 2016-07-28, Gunnar Wolf wrote:
> Alan Corey said [Thu, Jul 28, 2016 at 12:22:23PM -0400]:
>> Huh? I thought they claimed they were interchangeable. I had an
>> image from my model B days 3 years ago that I booted on my 3B. And I
>> cloned a working current 3B SD card and booted a Zero from it. There
>> isn't a different Debian image for every brand of motherboard and CPU,
>> they probe to see what hardware is there. I wouldn't expect older
>> images to contain drivers for newer hardware maybe.
...
> I'm far from an absolute expert in this area... But I am fairly
> certain of what I say. That is, I have a RPi 1 and 2B, and they
> cannot boot from the same images.

I believe they ship different kernels for different boards all on one image. Or, at least, they ship (or used to ship) an rpi1 and an rpi2 kernel; not sure if the rpi3 uses the same kernel as the rpi2 (possibly with a different device-tree).

> Keep in mind it's not different Debian images we are talking about:
> "real" Debian cannot be booted on Raspberry hardware. I run a Debian
> userland on top of their provided kernel (with the mystery blobs to
> control its hardware), started by their mystery bootloader.

Well, I've got three Raspberry Pi 2 boards running kernels shipped by Debian (either jessie-backports or experimental) and u-boot shipped by Debian, but it does require the GPU firmware to bootstrap the CPU.

live well,
vagrant
Re: Broadcom BCM2709, ARMv8, and missing CPU features
On Thu, Jul 28, 2016 at 12:22:23PM -0400, Alan Corey wrote:
> Huh? I thought they claimed they were interchangeable. I had an
> image from my model B days 3 years ago that I booted on my 3B. And I
> cloned a working current 3B SD card and booted a Zero from it. There
> isn't a different Debian image for every brand of motherboard and CPU,
> they probe to see what hardware is there. I wouldn't expect older
> images to contain drivers for newer hardware maybe.
>
> I guess I wouldn't make too much of the jump to 64 bit just yet. I
> remember when i386 jumped to 32 bit. 16 bit had a messy segmented
> memory addressing scheme I was glad to get away from. I can't afford
> more than 32 bits worth of RAM anyway, especially since I've usually
> got about 4 machines running.

Well, it isn't actually just a question of memory (most 8 bit CPUs had a 16 bit address space, many 16 bit CPUs had a 24 or 32 bit address space, and some 32 bit x86 and ARM chips can do a 36 or 40 bit address space physically). What you often gain going to a 64 bit CPU is the ability to do 64 bit arithmetic in one instruction, and to keep each variable in one register rather than two, instead of relying on a sequence of instructions the compiler generates for you.

After all, if you take two 64 bit integers and try to multiply them on a 32 bit CPU, most of the time you end up with numerous multiply, shift, add and mask instructions to implement the calculation using 32 bit only instructions, while on a 64 bit CPU it is usually just one instruction. So the 64 bit CPU will probably do the calculation faster than the 32 bit CPU. Of course if you only need 32 bit calculations, then it doesn't matter, so in many cases it isn't an issue, but when it matters it can really make a difference in performance.

--
Len Sorensen
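The multiply case can be written out by hand. The following C sketch computes the low 64 bits of a 64x64 product using only 32-bit halves, which is roughly the work a compiler has to arrange on a 32-bit target. The function names are made up for this example, and the exact instruction sequence a real compiler emits will differ by CPU and compiler version.

```c
#include <assert.h>
#include <stdint.h>

/* Low 64 bits of a * b, built from 32-bit pieces:
   a * b = a_lo*b_lo + 2^32 * (a_lo*b_hi + a_hi*b_lo)   (mod 2^64)
   The a_hi*b_hi term is shifted out entirely. */
static uint64_t mul64_via_32(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    /* One widening 32x32->64 multiply for the low halves. */
    uint64_t lo_lo = (uint64_t)a_lo * b_lo;

    /* Two 32x32->32 multiplies; only their low 32 bits survive the shift. */
    uint32_t cross = a_lo * b_hi + a_hi * b_lo;

    return lo_lo + ((uint64_t)cross << 32);
}

/* Cross-check against the compiler's own 64-bit multiply. */
static void check_mul64(uint64_t a, uint64_t b)
{
    assert(mul64_via_32(a, b) == a * b);
}
```

That is three multiplies plus the adds and shifts, against a single multiply instruction on a 64-bit core. Code that only ever multiplies 32-bit values pays none of this.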
Re: Broadcom BCM2709, ARMv8, and missing CPU features
Alan Corey said [Thu, Jul 28, 2016 at 12:22:23PM -0400]:
> Huh? I thought they claimed they were interchangeable. I had an
> image from my model B days 3 years ago that I booted on my 3B. And I
> cloned a working current 3B SD card and booted a Zero from it. There
> isn't a different Debian image for every brand of motherboard and CPU,
> they probe to see what hardware is there. I wouldn't expect older
> images to contain drivers for newer hardware maybe.
>
> I guess I wouldn't make too much of the jump to 64 bit just yet. I
> remember when i386 jumped to 32 bit. 16 bit had a messy segmented
> memory addressing scheme I was glad to get away from. I can't afford
> more than 32 bits worth of RAM anyway, especially since I've usually
> got about 4 machines running.

I'm far from an absolute expert in this area, but I am fairly certain of what I say: I have a RPi 1 and a 2B, and they cannot boot from the same images.

Keep in mind it's not different Debian images we are talking about; "real" Debian cannot be booted on Raspberry hardware. I run a Debian userland on top of their provided kernel (with the mystery blobs to control its hardware), started by their mystery bootloader. And yes, those of us coming from the x86 world expect similar devices to "just work", but on ARM it really *is* a different way of doing things for each kind of board.

I suggest watching the talk Martin Michlmayr delivered some weeks ago at DebConf on this topic:

http://ftp.acc.umu.se/pub/debian-meetings/2016/debconf16/Debian_on_ARM_devices_2.webm
Re: Broadcom BCM2709, ARMv8, and missing CPU features
Huh? I thought they claimed they were interchangeable. I had an image from my model B days 3 years ago that I booted on my 3B. And I cloned a working current 3B SD card and booted a Zero from it. There isn't a different Debian image for every brand of motherboard and CPU, they probe to see what hardware is there. I wouldn't expect older images to contain drivers for newer hardware maybe.

I guess I wouldn't make too much of the jump to 64 bit just yet. I remember when i386 jumped to 32 bit. 16 bit had a messy segmented memory addressing scheme I was glad to get away from. I can't afford more than 32 bits worth of RAM anyway, especially since I've usually got about 4 machines running.

On 7/28/16, Gunnar Wolf wrote:
> Alan Corey said [Wed, Jul 27, 2016 at 01:28:31PM -0400]:
>> > 64-bit/ARMv8 on the RPi3 is still in progress.
>>
>> Yes, so they claim and I wonder how they're going to deal with the
>> fact that some Pis are 32 bit and some 64. I posted this question
>> there but I haven't looked into the links in the response a lot:
>> https://www.raspberrypi.org/forums/viewtopic.php?f=63=154497=1010500#p1010500
>
> It should not be that much of a deal. After all, images for the
> different generations of Raspberries are not interchangeable: a RPi1
> won't boot a RPi2 image, nor vice versa. Of course, the earlier
> generations could share all compiled binaries, while now it won't be
> the case (when it actually runs 64, that is).
>

--
Credit is the root of all evil. - AB1JX
Re: Broadcom BCM2709, ARMv8, and missing CPU features
Alan Corey said [Wed, Jul 27, 2016 at 01:28:31PM -0400]:
> > 64-bit/ARMv8 on the RPi3 is still in progress.
>
> Yes, so they claim and I wonder how they're going to deal with the
> fact that some Pis are 32 bit and some 64. I posted this question
> there but I haven't looked into the links in the response a lot:
> https://www.raspberrypi.org/forums/viewtopic.php?f=63=154497=1010500#p1010500

It should not be that much of a deal. After all, images for the different generations of Raspberries are not interchangeable: a RPi1 won't boot a RPi2 image, nor vice versa. Of course, the earlier generations could share all compiled binaries, while now that won't be the case (once it actually runs 64-bit, that is).
Re: Broadcom BCM2709, ARMv8, and missing CPU features
> Using '.byte' below rather than '.inst' or '.inst.w' is another can of
> worms...
>
> $ gcc -g3 -O0 -march=armv7-a -mfpu=neon test.cc -o test.exe
> $ ./test.exe
> $
>
> $ cat test.cc
> #include
> int main(int argc, char* argv[])
> {
>   __asm__ __volatile__
>   (
>     ".code 32"
>
>     // CRC using word
>     ".byte 0x1a, 0xc1, 0x58, 0x00;\n"
>     // CRC using half word
>     ".byte 0x1a, 0xc1, 0x54, 0x00;\n"
>     // CRC using byte
>     ".byte 0x1a, 0xc1, 0x50, 0x00;\n"
>     // PMULL
>     ".byte 0x0e, 0xe1, 0xe0, 0x00;\n"
>     // PMULL2
>     ".byte 0x4e, 0xe1, 0xe0, 0x00;\n"
>     // AES (aese)
>     ".byte 0x4e, 0x28, 0x48, 0x20;\n"
>     // AES (aesd)
>     ".byte 0x4e, 0x28, 0x58, 0x20;\n"
>     // SHA1 (sha1c)
>     ".byte 0x5e, 0x02, 0x00, 0x20;\n"
>     // SHA1 (sha1m)
>     ".byte 0x5e, 0x02, 0x20, 0x20;\n"
>     // SHA1 (sha1p)
>     ".byte 0x5e, 0x02, 0x30, 0x20;\n"
>     :
>     :
>     : "cc", "d0", "d1", "d2", "q0", "q1", "q2"
>   );
>
>   return 0;
> }

All that silliness was not needed. All that was needed was the following (and maybe a float ABI flag):

gcc -march=armv8-a+crc -mtune=cortex-a53 -mfpu=crypto-neon-fp-armv8 ...

I can't believe I could not piece that together from the man pages (thanks to the GCC and SO folks).

Jeff
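With flags like those, the CRC32 instructions are reachable through the ACLE intrinsics instead of hand-encoded bytes. The sketch below is one way that might look in C: it uses `__crc32b` from `<arm_acle.h>` when the compiler defines `__ARM_FEATURE_CRC32`, and otherwise falls back to a plain bitwise loop over the same polynomial. The function names `crc32_update` and `crc32_buf` are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>   /* ACLE CRC32 intrinsics, enabled by e.g. -march=armv8-a+crc */
#endif

/* Feed `len` bytes into a running CRC-32 (reflected polynomial 0xEDB88320).
   The caller handles the initial value and the final inversion. */
static uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
#if defined(__ARM_FEATURE_CRC32)
        crc = __crc32b(crc, data[i]);          /* one CRC32B instruction per byte */
#else
        crc ^= data[i];                        /* portable bit-at-a-time fallback */
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
#endif
    }
    return crc;
}

/* Conventional CRC-32 of a buffer: start from all ones, invert at the end. */
static uint32_t crc32_buf(const void *data, size_t len)
{
    return ~crc32_update(0xFFFFFFFFu, (const uint8_t *)data, len);
}
```

The guard is a compile-time check only. A binary built with +crc will still fault on a CPU without the extension, so a library that must run everywhere would also need a runtime check.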
Re: Broadcom BCM2709, ARMv8, and missing CPU features
On Thu, Jul 28, 2016 at 3:06 AM, Tixywrote: > On Thu, 2016-07-28 at 02:38 -0400, Jeffrey Walton wrote: > [...] >> >> // AES (aese) >> >> ".byte 0x4e, 0x28, 0x48, 0x20;\n" >> > >> > So as instructions are little-endian that's 0x2048284e for a 32-bit >> > instruction, or 0x284e2048 if it's a Thumb2 instruction (I'm showing >> > that the same way as the ARM ARM does). >> >> I pulled the encodings from a known good machine that used intrinsics. >> I did not hand encode them (too much work). >> >> > According to my copy of the ARM ARM, the AESE instruction has these >> > encodings: >> > >> > For Thumb: >> > >> > 1 1 1 1 1 1 1 1 1 D 1 1 size 0 0 Vd 0 0 1 1 0 0 M 0 Vm >> > >> > For ARM >> > >> > 1 1 1 1 0 0 1 1 1 D 1 1 size 0 0 Vd 0 0 1 1 0 0 M 0 Vm >> > >> > For AArch64 >> > >> > 0 1 0 0 1 1 1 0 size 1 0 1 0 0 0 0 1 0 0 1 0 Rn Rd >> > >> > So it looks like you've used the AArch64 encoding (for something >> > compiled and presumably run as AArch32?!) and gotten the byte order the >> > wrong way around. >> >> I'm not sure if it matters, but this is an ARMv8 device running a 32-bit OS. > > So it's running in AArch32 mode, and you want the encodings for that, > not the AArch64 version. I.e. the second encoding I mentioned, which > would be > > .inst 0xf3b00300 OK, thanks. I *think* what may have happened is the disassembly occurred on host machine, not the target machine. > or better, find a compiler version and options that knows about the > instructions you want to test (which I see you already asked about > below). Sorry I can't help with that, I know little about toolchains, > and have also never used the newer ARM instruction features like VFP, > SIMD, crytpo etc. Yeah, this is quite painful at the moment. I think there's a disconnect between what's advertised to work, and what works in practice. Let me see if GCC 6.0 is available; or if Clang has better success. I'm doing my best to avoid building GCC myself. I have very bad memories from that experience. Jeff
Re: Broadcom BCM2709, ARMv8, and missing CPU features
On Thu, 2016-07-28 at 02:38 -0400, Jeffrey Walton wrote:
[...]
> >> // AES (aese)
> >> ".byte 0x4e, 0x28, 0x48, 0x20;\n"
> >
> > So as instructions are little-endian that's 0x2048284e for a 32-bit
> > instruction, or 0x284e2048 if it's a Thumb2 instruction (I'm showing
> > that the same way as the ARM ARM does).
>
> I pulled the encodings from a known good machine that used intrinsics.
> I did not hand encode them (too much work).
>
> > According to my copy of the ARM ARM, the AESE instruction has these
> > encodings:
> >
> > For Thumb:
> >
> > 1 1 1 1 1 1 1 1 1 D 1 1 size 0 0 Vd 0 0 1 1 0 0 M 0 Vm
> >
> > For ARM
> >
> > 1 1 1 1 0 0 1 1 1 D 1 1 size 0 0 Vd 0 0 1 1 0 0 M 0 Vm
> >
> > For AArch64
> >
> > 0 1 0 0 1 1 1 0 size 1 0 1 0 0 0 0 1 0 0 1 0 Rn Rd
> >
> > So it looks like you've used the AArch64 encoding (for something
> > compiled and presumably run as AArch32?!) and gotten the byte order the
> > wrong way around.
>
> I'm not sure if it matters, but this is an ARMv8 device running a 32-bit OS.

So it's running in AArch32 mode, and you want the encodings for that, not the AArch64 version. I.e. the second encoding I mentioned, which would be

.inst 0xf3b00300

or better, find a compiler version and options that know about the instructions you want to test (which I see you already asked about below). Sorry I can't help with that, I know little about toolchains, and have also never used the newer ARM instruction features like VFP, SIMD, crypto etc.

> I'm still trying to figure out how to build test cases for an Aarch32
> execution on Aarch64. Eventually it will go into an open source
> library's test script. Also see
> https://gcc.gnu.org/ml/gcc-help/2016-06/msg00097.html.

--
Tixy
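For anyone checking the arithmetic behind that .inst value, here is a small C sketch that assembles the 32-bit word from the "For ARM" bit pattern quoted above. The function name `aese_a32_encode` is made up, and the size field is taken as 00, which is an assumption consistent with the 0xf3b00300 value rather than something stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Build the A32 AESE word from the quoted layout:
     1111 0011 1 D 11 size 00 Vd 0011 0 0 M 0 Vm
   Field positions: D = bit 22, Vd = bits 15:12, M = bit 5, Vm = bits 3:0.
   The size field (bits 19:18) is assumed to be 00. */
static uint32_t aese_a32_encode(unsigned d, unsigned vd, unsigned m, unsigned vm)
{
    assert(d <= 1 && vd <= 15 && m <= 1 && vm <= 15);

    /* Fixed bits of the pattern with every variable field set to zero. */
    const uint32_t base = 0xF3B00300u;

    return base
         | ((uint32_t)d  << 22)
         | ((uint32_t)vd << 12)
         | ((uint32_t)m  << 5)
         |  (uint32_t)vm;
}
```

With all four register fields at zero the result is 0xF3B00300, the value given above. An assembler that knows the crypto extension would do this encoding itself, which is the "or better" option.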
Re: Broadcom BCM2709, ARMv8, and missing CPU features
>> Using '.byte' below rather than '.inst' or '.inst.w' is another can of >> worms... > > And if I'm not mistaken, the part of the reason why you got the > instructions wrong... > >> $ gcc -g3 -O0 -march=armv7-a -mfpu=neon test.cc -o test.exe >> $ ./test.exe >> $ > > Does the tool-chain default to ARM or Thumb? I assume ARM code. I believe its ARM. >> $ cat test.cc >> #include >> int main(int argc, char* argv[]) >> { >> __asm__ __volatile__ >> ( >> ".code 32" > > BTW, above selects ARM code generation, but won't have any affect > because you don't specify any labels or instruction mnemonics to > assemble. That's the only thing that managed to get a good disassembly from objdump -d. >> // AES (aese) >> ".byte 0x4e, 0x28, 0x48, 0x20;\n" > > So as instructions are little-endian that's 0x2048284e for a 32-bit > instruction, or 0x284e2048 if it's a Thumb2 instruction (I'm showing > that the same way as the ARM ARM does). I pulled the encodings from a known good machine that used intrinsics. I did not hand encode them (too much work). > According to my copy of the ARM ARM, the AESE instruction has these > encodings: > > For Thumb: > > 1 1 1 1 1 1 1 1 1 D 1 1 size 0 0 Vd 0 0 1 1 0 0 M 0 Vm > > For ARM > > 1 1 1 1 0 0 1 1 1 D 1 1 size 0 0 Vd 0 0 1 1 0 0 M 0 Vm > > For AArch64 > > 0 1 0 0 1 1 1 0 size 1 0 1 0 0 0 0 1 0 0 1 0 Rn Rd > > So it looks like you've used the AArch64 encoding (for something > compiled and presumably run as AArch32?!) and gotten the byte order the > wrong way around. I'm not sure if it matters, but this is an ARMv8 device running a 32-bit OS. I'm still trying to figure out how to build test cases for an Aarch32 execution on Aarch64. Eventually it will go into an open source library's test script. Also see https://gcc.gnu.org/ml/gcc-help/2016-06/msg00097.html. If you know how to do it, then please email me on a sidebar. I'm happy to test theories. I think -mcpu=... factors into it somewhere. Jeff
Re: Broadcom BCM2709, ARMv8, and missing CPU features
On Thu, 2016-07-28 at 00:48 -0400, Jeffrey Walton wrote:
> Using '.byte' below rather than '.inst' or '.inst.w' is another can of
> worms...

And if I'm not mistaken, part of the reason why you got the instructions wrong...

> $ gcc -g3 -O0 -march=armv7-a -mfpu=neon test.cc -o test.exe
> $ ./test.exe
> $

Does the tool-chain default to ARM or Thumb? I assume ARM code.

> $ cat test.cc
> #include
> int main(int argc, char* argv[])
> {
>   __asm__ __volatile__
>   (
>     ".code 32"

BTW, the above selects ARM code generation, but won't have any effect because you don't specify any labels or instruction mnemonics to assemble.

>     // AES (aese)
>     ".byte 0x4e, 0x28, 0x48, 0x20;\n"

So as instructions are little-endian that's 0x2048284e for a 32-bit instruction, or 0x284e2048 if it's a Thumb2 instruction (I'm showing that the same way as the ARM ARM does).

According to my copy of the ARM ARM, the AESE instruction has these encodings:

For Thumb:

1 1 1 1 1 1 1 1 1 D 1 1 size 0 0 Vd 0 0 1 1 0 0 M 0 Vm

For ARM

1 1 1 1 0 0 1 1 1 D 1 1 size 0 0 Vd 0 0 1 1 0 0 M 0 Vm

For AArch64

0 1 0 0 1 1 1 0 size 1 0 1 0 0 0 0 1 0 0 1 0 Rn Rd

So it looks like you've used the AArch64 encoding (for something compiled and presumably run as AArch32?!) and gotten the byte order the wrong way around.

Disclaimer: I'm only on my first coffee of the morning, so quite likely not 100% accurate in my statements above. ;-)

--
Tixy
Re: Broadcom BCM2709, ARMv8, and missing CPU features
On Wed, Jul 27, 2016 at 2:18 AM, Paul Wise wrote:
> On Wed, Jul 27, 2016 at 11:28 AM, Jeffrey Walton wrote:
>
>> I recently purchased a Raspberry Pi 3. Its got a Broadcom SoC, and its
>> ARMv8.
> ...
>> model name: ARMv7 Processor rev 4 (v7l)
>
> Looks like you are running it in ARMv7 32-bit mode, perhaps that
> disables the ARMv8 features.
>
> I recently watched the DebConf16 ARM talk and from memory support for
> 64-bit/ARMv8 on the RPi3 is still in progress.

I've had some time to kick the tires, so to speak.

CPU flags indicate only crc32 from the ARMv8 instruction set. And I can't get the stock toolchain to consume other intrinsics, like PMULL and PMULL2.

However, dropping into the GCC extended assembler, the program executes CRC, PMULL, PMULL2, AES, SHA1 and SHA2 without causing an illegal instruction.

It would be nice if the Raspberry folks enabled the intrinsics and instructions in the toolchain for the devs who have the specialized code to take advantage of it.

***

Using '.byte' below rather than '.inst' or '.inst.w' is another can of worms...

$ gcc -g3 -O0 -march=armv7-a -mfpu=neon test.cc -o test.exe
$ ./test.exe
$

$ cat test.cc
#include
int main(int argc, char* argv[])
{
  __asm__ __volatile__
  (
    ".code 32"

    // CRC using word
    ".byte 0x1a, 0xc1, 0x58, 0x00;\n"
    // CRC using half word
    ".byte 0x1a, 0xc1, 0x54, 0x00;\n"
    // CRC using byte
    ".byte 0x1a, 0xc1, 0x50, 0x00;\n"
    // PMULL
    ".byte 0x0e, 0xe1, 0xe0, 0x00;\n"
    // PMULL2
    ".byte 0x4e, 0xe1, 0xe0, 0x00;\n"
    // AES (aese)
    ".byte 0x4e, 0x28, 0x48, 0x20;\n"
    // AES (aesd)
    ".byte 0x4e, 0x28, 0x58, 0x20;\n"
    // SHA1 (sha1c)
    ".byte 0x5e, 0x02, 0x00, 0x20;\n"
    // SHA1 (sha1m)
    ".byte 0x5e, 0x02, 0x20, 0x20;\n"
    // SHA1 (sha1p)
    ".byte 0x5e, 0x02, 0x30, 0x20;\n"
    :
    :
    : "cc", "d0", "d1", "d2", "q0", "q1", "q2"
  );

  return 0;
}
Re: Broadcom BCM2709, ARMv8, and missing CPU features
On Thu, Jul 28, 2016 at 1:28 AM, Alan Corey wrote: > Yes, so they claim and I wonder how they're going to deal with the > fact that some Pis are 32 bit and some 64. ISTR that they plan on keeping 32-bit for the official stuff recommended by the RPi folks for simplification. -- bye, pabs https://wiki.debian.org/PaulWise
Re: Broadcom BCM2709, ARMv8, and missing CPU features
> 64-bit/ARMv8 on the RPi3 is still in progress. Yes, so they claim and I wonder how they're going to deal with the fact that some Pis are 32 bit and some 64. I posted this question there but I haven't looked into the links in the response a lot: https://www.raspberrypi.org/forums/viewtopic.php?f=63=154497=1010500#p1010500 -- Credit is the root of all evil. - AB1JX
Re: Broadcom BCM2709, ARMv8, and missing CPU features
On Wed, Jul 27, 2016 at 11:28 AM, Jeffrey Walton wrote: > I recently purchased a Raspberry Pi 3. Its got a Broadcom SoC, and its > ARMv8. ... > model name: ARMv7 Processor rev 4 (v7l) Looks like you are running it in ARMv7 32-bit mode, perhaps that disables the ARMv8 features. I recently watched the DebConf16 ARM talk and from memory support for 64-bit/ARMv8 on the RPi3 is still in progress. -- bye, pabs https://wiki.debian.org/PaulWise