It is worth adding that, in the LLVM back-end for ARM, the only two distinctions made for vfp3 over vfp2 are the use of the 16 additional FP registers and the use of special "vmov" instructions to materialize immediate values into FP registers. A scan of the code relating to these two areas might reveal something.
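As a reference point for checking the immediate-materializing path, here is a minimal sketch of how an 8-bit "vmov" immediate expands to a single-precision value. It follows the VFPExpandImm pseudocode in the ARM Architecture Reference Manual as I understand it. The function names are made up for illustration and are not taken from M5 or LLVM.

```python
import struct


def vfp_expand_imm32(imm8: int) -> float:
    """Expand an 8-bit VFPv3 'vmov' immediate (bits abcdefgh) to a float32 value.

    Illustrative helper only; the name is hypothetical, not an M5 or LLVM function.
    """
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must fit in 8 bits")
    a = (imm8 >> 7) & 1        # sign bit
    b = (imm8 >> 6) & 1        # selects the exponent range
    cd = (imm8 >> 4) & 0b11    # low two exponent bits
    efgh = imm8 & 0xF          # top four fraction bits
    # 8-bit biased exponent: NOT(b), then b repeated five times, then cd.
    exponent = ((b ^ 1) << 7) | ((0b11111 if b else 0) << 2) | cd
    # Assemble the IEEE 754 single-precision bit pattern; low fraction bits are zero.
    bits = (a << 31) | (exponent << 23) | (efgh << 19)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def encodable_values() -> list:
    """All 256 single-precision values a vmov immediate can represent, sorted."""
    return sorted(vfp_expand_imm32(i) for i in range(256))


if __name__ == "__main__":
    for imm8 in (0x70, 0x60, 0x00, 0xF0):
        print(f"imm8=0x{imm8:02x} -> {vfp_expand_imm32(imm8)}")
```

Under this encoding only values of the form +/-(n/16) * 2^e with n in 16..31 and e in -3..4 are representable, so magnitudes run from 0.125 to 31.0, and constants such as 0.0 or 0.1 cannot be materialized this way. Comparing a simulator's decoded result against a table like this for all 256 encodings is one cheap way to rule the immediate form in or out.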
I tried to see if I could spot anything by looking at the M5 ARM VmovImm* instructions, but my knowledge of M5 is still limited and I couldn't spot anything unusual offhand.

On Wed, Apr 27, 2011 at 9:41 AM, Marc de Kruijf <[email protected]> wrote:

> Fortunately for me, vfp2 support is sufficient for my current purpose, so I don't feel a need to fix these problems in the near term. I do appreciate the pointers, though, for future reference.
>
> The exact CodeSourcery GNU/Linux Lite cross compiler I use is the latest version available at
> http://www.codesourcery.com/sgpp/lite/arm/portal/release1600
>
> I downloaded the "Advanced Package":
>
> http://www.codesourcery.com/sgpp/lite/arm/portal/package7851/public/arm-none-linux-gnueabi/arm-2010.09-50-arm-none-linux-gnueabi-i686-pc-linux-gnu.tar.bz2
>
> Since SPEC 2006 doesn't come with an ARM .cfg file I had to create my own, but I haven't had any problems with it that I am aware of. I've attached the .cfg file I use for the CodeSourcery cross compiler compiling for vfp3, which yields the problems I mentioned. Also, I forgot to mention before that I run these benchmarks with their reference inputs.
>
> I hope that helps.
>
> Thanks for all the great work on M5. It is a joy to work with -- definitely among the finest written and best designed pieces of software I have ever used!
>
> Marc
>
> On Tue, Apr 26, 2011 at 11:44 PM, Ali Saidi <[email protected]> wrote:
>
>> I've attached a patch that I used the last time I debugged something with the statetrace program. It might not be perfect for your situation, but it addressed the issues I had when comparing execution with a TI beagle board and some reasonably recent version of ubuntu.
>>
>> Ali
>>
>> On Apr 26, 2011, at 5:36 PM, Gabriel Michael Black wrote:
>>
>> > Hi Marc. Thanks for verifying on real hardware. We have a program in util/ called statetrace which I use for figuring out these sorts of issues.
>> > It runs the program in question on a real system under ptrace and compares many of the registers cycle by cycle with what M5 is getting. This would help you determine where things go off track.
>> >
>> > There are a few drawbacks to using this tool though. First, since it compares execution instruction by instruction, everything has to line up exactly. You'll have to turn off address space randomization, make sure the environment variables and command line arguments (including file paths) are exactly the same, get information about files to match so IO is buffered the same, make the kernel version match in uname, and possibly add/remove any auxiliary vectors. There can't be any randomization in the code, and if there is it has to be synced between the real and simulated code. There's a blob of code the kernel links in which can also cause divergences in two ways. First, the actual code may change between versions of the kernel. Second, because the code is mapped read only, software breakpoints don't work. The implementation of ptrace in the kernel uses them, so if you single step into that blob, execution escapes and the program runs to completion. There are some hacks that should address that, but it's not perfect. Also, running under statetrace is slower than normal. I'm not sure if running to 5 billion instructions is practical. Also divergences in memory are not tracked. If a store stores the wrong thing or to the wrong place, you won't see it until a load accesses it. This is usually obnoxious to track down, but it can be done.
>> >
>> > But if you do address all the possible divergences and get it to run, it should notice when execution diverges. That will give you a really good clue as to what's going wrong.
>> >
>> > All this assumes that you're trying to debug a fixed binary.
>> > You could alternatively modify the benchmarks you're running and add debugging output, extra checks, etc., to try to narrow in on what the problem is.
>> >
>> > I'm sure this sounds like more work than you were hoping for and it's unfortunate things didn't just work for you, but if you're willing to track down the problem we'd really appreciate it. Sorry, thanks, and good luck!
>> >
>> > Gabe
>> >
>> > Quoting Marc de Kruijf <[email protected]>:
>> >
>> >> I apologize for the delay in responding; I wanted to make sure that I didn't experience these problems running on actual ARM hardware. It appears that I do not.
>> >>
>> >> I found through experimentation that I only experience issues when I compile with fp support beyond vfp2; I don't have problems for vfp2, but I do see occasional runtime errors compiling for vfp3 and neon. So, for instance, if I give my gcc cross-compiler these options "-mfpu=vfpv3 -mfloat-abi=softfp", I see some issues crop up (more information below).
>> >>
>> >> I am simulating the first 5 billion instructions in ARM_SE mode for 17/19 of the SPEC 2006 C/C++ benchmarks. I use two different compilers, one is the freely available CodeSourcery Lite GNU/Linux cross-compiler from CodeSourcery, and the other is an LLVM-based cross compiler. The LLVM compiler uses the same CodeSourcery assembler, but generates different machine code.
>> >>
>> >> Below are the errors reported by each compiler. The differences between compilers are most likely due to different instruction choices, although it may also have to do with the limited simulation time (5B insts). As I mentioned, these issues don't appear with vfp2.
>> >>
>> >> LLVM and CodeSourcery:
>> >> 444.namd (checksum failure)
>> >> 462.libquantum (free of invalid pointer)
>> >> 482.sphinx3 (runtime error - illegal log base)
>> >>
>> >> LLVM only:
>> >> 447.dealII (illegal page fault)
>> >>
>> >> CodeSourcery only:
>> >> 445.gobmk (assertion failure)
>> >> 453.povray (version number mismatch)
>> >>
>> >> On Sun, Apr 17, 2011 at 11:48 AM, Ali Saidi <[email protected]> wrote:
>> >>
>> >>> Specifically, could you provide us with exactly what you're running (SE vs. FS), spec benchmark, compiler, options, etc.
>> >>>
>> >>> Thanks,
>> >>> Ali
>> >>>
>> >>> On Apr 17, 2011, at 4:10 AM, Gabe Black wrote:
>> >>>
>> >>> Yes, they are fully supported. What problems are you seeing?
>> >>>
>> >>> Gabe
>> >>>
>> >>> On 04/16/11 20:52, Marc de Kruijf wrote:
>> >>>
>> >>> Are ARM hardware floating point instructions fully supported in M5? When I compile with hardware FP I see problems running some SPEC2006 benchmarks whereas I don't see the same problems when I compile using software FP.
>> >>>
>> >>> Thank you!
>> >>> Marc
>> >>>
>> >>> _______________________________________________
>> >>> m5-users mailing list
>> >>> [email protected]
>> >>> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
