Re: Error in GCC documentation page
But in the C++ standard "integral expression" is more common. "Integral" is an adjective and "integer" is a noun. "Integer expression", though grammatically wrong (or, at best, an elision of two nouns), is perfectly clear and unambiguous, whereas "integral expression", though grammatically correct, hits some people as "built-in expression" and trips others up as an unfamiliar and rare word whose meaning is uncertain - for what gain? Personally, I like "integral expression", but then I'm a native English speaker and UK academic with an extended vocabulary. For world-class documentation, it depends whether it's more important to be clear and unambiguous to all readers or an object lesson in type-correct advanced English. I'd say our friend has pointed out a tiny place where it could be made a little more effective in the first of these purposes. M
Re: Deprecating ARM FPA support (was: ARM Neon Tests Failing on non-Neon Target)
On 6/27/10, Gerald Pfeifer ger...@pfeifer.com wrote: On Mon, 24 May 2010, Richard Kenner wrote: I think that's a critical distinction. I can't see removing a port just because it's not used much (or at all) because it might be valuable for historical reasons or to show examples of how to do things. I'd say a port with zero known users should actually be removed. FPA is very widely used. From day 0 until 2006 it was the only FP model emulated by the Linux kernel and so is required by all operating systems created up to that date. Actively-maintained software distributions and recent ports of Linux tend to use a different ABI (EABI) whose default FP model is user-space softfloat and does not require FPA code generation (thankfully!); however, there are many existing software distributions in current use that only support emulated hard FPA instructions. For ARM boards without mainline Linux support whose manufacturers' kernel ports predate 2.6.16, it is mandatory, as it also is for users who just want to compile code for a given existing system that happens not to be running a recent kernel and userspace. M
Re: Patch pinging
Still, we'll see... Apparently not :( Why not? At most, you just need to make sure nothing ever sends mail to people who think that kind of thing is bozoid... M
Re: Patch pinging
On 6/8/10, NightStrike nightstr...@gmail.com wrote: Are you volunteering to write that small script? Dunno, are you volunteering to write that small script? You're the only one here actually volunteering a forward-going commitment of their time to improve GCC's development in this way, it seems (and mostly just getting vilified for it, for using a bizarre camelcase name!) What I expected to happen was that you would start doing what you envision should happen by hand, and would then get so bored of doing it that out of laziness you'd automate it somehow. :) Still, we'll see... M
Re: Patch pinging
On 6/7/10, NightStrike nightstr...@gmail.com wrote: On Wed, Jun 2, 2010 at 3:17 PM, Diego Novillo dnovi...@google.com wrote: On Wed, Jun 2, 2010 at 14:09, NightStrike nightstr...@gmail.com wrote: threads that haven't been addressed. I offered to Ian to do the same thing for the whole mailing list if we can make it a policy that people who commit changes do what Kai is doing so that it's clear that the thread is done with. I don't mind throwing a few pings down, and I already have the whole ML tagged with a gmail label. Seems like a good idea to me. I do not usually read the list every day (or every week, sometimes), so if a patch is in my area and I had not been directly CC'd, it can take me up to 2 weeks to get to it. Most of the areas I'm on have good coverage (particularly since I share much with richi, who is a very prolific patch reviewer), so it's not too much of a problem. Ok. Is one person responding enough for me to start doing that? I don't know how this sort of approval / acceptance process works for GCC. Excellent idea, and thanks for volunteering. M
Re: merging the maverick FPU patches
On 4/25/10, Ian Lance Taylor i...@google.com wrote: Martin Guy martinw...@gmail.com writes: now that stage3 is over I'm thinking of updating the MaverickCrunch FPU fixes (currently for 4.3) and merging them but would appreciate some guidance. There are 26 patches in all and I can't expect anyone to understand them because they require a good understanding of the FPU and its hardware bugs (and there are a lot of them!) :) What's the best path? Create a branch, work there and then merge when done? I have done the copyright assignment stuff but don't have an account on gcc.gnu.org. They all affect files under config/arm/ apart from one testsuite fix and the docs. For a backend specific patch like this, I would recommend contacting the ARM backend maintainers directly to ask how they would prefer to proceed. They can be found in the top level MAINTAINERS file:
arm port    Nick Clifton        ni...@redhat.com
arm port    Richard Earnshaw    richard.earns...@arm.com
arm port    Paul Brook          p...@codesourcery.com
Hi I've had no reply from anyone - maybe everyone is hoping someone else will do so. :) Of the three companies, Red Hat would be the most suitable, since the original unfinished port was done by them, and I guess ARM has no interest in making GCC work with non-ARM FPUs. The code they add/remove/change is pretty self-contained and doesn't impact the other code generation options. It just fixes the current implementation. Nick, are you willing to do the necessary? Since it just fixes existing code that never worked, all it requires from a maintainer is to check that it doesn't break code generation for other targets, which is easy to check automatically by testing a sample of targets, and it's not hard to check by eye that the changes are only active when TARGET_MAVERICK. Cheers M
Re: Deprecating ARM FPA support
On 5/24/10, Mark Mitchell m...@codesourcery.com wrote: Certainly removing support for FPA (and any targets that require it) as a first step would be an option; but we should also focus on where we want to get to. I agree with that. But, it would also be interesting to know just how broken that code is. If, in fact, FPA and/or ARM ELF mostly work at present, then there's less call for actually removing (as opposed to deprecating) things. FPA code generation is 100% good AFAIK, and has been used intensively for years (as the FPU model for all gnu/linux ports before EABI). Maverick is the one that has never worked since it was submitted; I have patches that make it 100% good (well, ok, no known failure cases) but don't know how to get them into mainline. M
Re: What is the best way to resolve ARM alignment issues for large modules?
On 5/7/10, Shaun Pinney shaun.pin...@bil.konicaminolta.us wrote: Essentially, we have code which works fine on x86/PowerPC but fails on ARM due to differences in how misaligned accesses are handled. The failures occur in multiple large modules developed outside of our team and we need to find a solution. The best question to sum this up is, how can we use the compiler to arrive at a complete solution to quickly identify all code locations which generate misaligned accesses and/or prevent the compiler from generating misaligned accesses? Dunno about the compiler, but if you use the Linux kernel you can fiddle with /proc/cpu/alignment. By default it's set to 0, which silently gives garbage results when unaligned accesses are made. "echo 3 > /proc/cpu/alignment" will fix those misalignments using a kernel trap to emulate correct behaviour (i.e. loading from bytes (char *)a to (char *)a + 3 in the case of an int). Alternatively, "echo 5 > /proc/cpu/alignment" will make an unaligned access cause a Bus Error, which usually kills the process, and you can identify the offending code by running it under gdb. Eliminating the unaligned accesses is tedious work, but the result will run slightly faster than relying on fixups, as well as being portable to any word-aligned system. M
merging the maverick FPU patches
Now that stage3 is over I'm thinking of updating the MaverickCrunch FPU fixes (currently for 4.3) and merging them, but would appreciate some guidance. There are 26 patches in all and I can't expect anyone to understand them because they require a good understanding of the FPU and its hardware bugs (and there are a lot of them!) :) What's the best path? Create a branch, work there and then merge when done? I have done the copyright assignment stuff but don't have an account on gcc.gnu.org. They all affect files under config/arm/ apart from one testsuite fix and the docs.
Re: Why not contribute? (to GCC)
OK, now that stage3 is over I'm thinking of updating the MaverickCrunch FPU fixes (currently for 4.3) and merging them but would appreciate some guidance. There are 26 patches in all and I can't expect anyone to understand them because they require a good understanding of the FPU and its hardware bugs (and there are a lot of them!) :) What's the best path? Create a branch, work there and then merge when done? I have done the copyright assignment stuff but don't have an account on gcc.gnu.org. They all affect files under config/arm/ apart from one testsuite fix and the docs. The missing part is a huge testsuite for it. I confess I find that daunting; it is potentially huge in that it replaces a non-working code generator with a working one, and for the non-working one there were *no* fpu-specific tests. Do I really need to write an entire validation suite? M
Re: Change x86 default arch for 4.5?
I disagree with the "default cpu should be 95% of what is currently on sale" argument. The default affects naive users most strongly, so it should just work on as many processors as is reasonable, not be as fast as possible on the majority of the processors currently on sale. Naive users might have anything. As an example of the results of that kind of thinking, we've had years of pain and wasted time in the ARM world due to the default architecture being armv5t instead of armv4t. The results are that user after user, making their first steps compiling a cross toolchain, turns up on the mailing lists having got "Illegal instruction" after days of work, and that almost all the distributions are forced to carry an "unbreak armv4t" patch to GCC. Lord, someone was even compelled to try and get Android working on their Openmoko, while it was binary-only, by emulating the few trivial instructions in the kernel. Ubuntu, similarly, excludes the lower end by leaving the default unchanged. When users get interested in maximal speed, the first thing they do is go for -mcpu=xyz, which doesn't require them to recompile the compiler and is educational, while software distributions who build their own compilers make their own choices about the minimum processors they want to support. Of course, most GCC developers and 95% of the people they know probably do all have big new PCs using processors that are currently in production, so to their eyes it might seem that all the world has SSE2. However most people in the world cherish anything that works at all; why make life harder for them? We don't see the bottom end in the various usage stats because most people using them don't have the internet (or a landline phone, come to that). The time-honoured policy of having the default settings work on as wide a range of hardware as possible is a socially inclusive one. Some manifestos chisel the low bar into their constitutions (Debian for example); it would be nice for GCC to do so too. 
Cheers M You can't buy a computer these days with less than a gigabyte. -- A.S.Tanenbaum, trying to defend Minix's fixed-size kernel arrays at FOSDEM 2010
Re: Change x86 default arch for 4.5?
On 2/21/10, Dave Korn dave.korn.cyg...@googlemail.com wrote: It makes perfect sense that configuring for i686-*-* should get you an i686 compiler and configuring for i586-*-* should get you an i586 compiler and so on, rather than that you get an i386 compiler no matter what you asked for. Agreed M
Re: Change x86 default arch for 4.5?
On 2/21/10, Steven Bosscher stevenb@gmail.com wrote: It is interesting how this conflicts with your signature: You can't buy a computer these days with less than a gigabyte. -- A.S.Tanenbaum, trying to defend Minix's fixed-size kernel arrays at FOSDEM 2010 I take it you disagree with this? Because most people do not expect to need 1GB for a Minix installation. ;-) It's a straw man, another example of bogus reasoning. Of course you can buy new computers with 64MB, and they are particularly suitable for simple kernels. The embedded Linux/BSD crowd at the presentation didn't seem very impressed either. You want to cater for a minority with old hardware. I actually expect you'll find that those users are less naive than the average gcc user. I want to cater for everyone, especially youngsters, learners and the poor struggling with whatever they can get their hands on. It's not even a rich country/poor country thing: I live in a run-down industrial area of England where the local kids are gagging for anything that works. Can you name these distributions? I can only name Debian (http://lists.debian.org/debian-arm/2006/06/msg00015.html) A quick search for unbreak-armv4t.patch shows, at a glance on the first ten hits, fedora, openembedded, slind, openmoko, mamona, android-porting. I'll leave you to peruse page two on :) Ubuntu also requires i686 or later. Ubuntu also needs 384MB to work these days, so it is a reasonable application-specific choice for that distro. GCC should not be tailored to high-end desktop, laptop and server machines. But anyway, bringing ARM into this discussion is neither here nor there. It is a specific example of a pointlessly high cpu default (for arm-*) where such a decision was made in GCC, and of the annoyances it causes, which is what this thread had drifted into. Your naive users (and mine) don't even know about -mcpu and -march. Exactly, so they go "cc hello.c; a.out" and get "Illegal instruction" unless they have a relatively new first-world PC. 
, which doesn't require them to recompile the compiler Neither does compiling for i386/i486 or armv4 if you have a cross-compiler for another default -- you can use -mcpu to downgrade too. Of course. However it does bite cross-compilers because people end up distributing the C library compiled for a high-end CPU, so no program will run even when you do drop the -mcpu level. Raising it instead still works for everyone. (**/me mumbles something incoherent about Pareto, etc...***) Moore's Law suggests that we should optimise most intensely for the physically slower processors, where sloth or speed translates into more real time, but I forgot that point in the last post :) Actually, this is irrelevant to the thread, since one always has to specify a CPU model in the tuple when configuring for i?86, and the thread was about an i686-* configuration tuple still producing a compiler that outputs i386 code by default, which does seem silly. Happy Sunday. M
Re: Change x86 default arch for 4.5?
On 2/21/10, Dave Korn dave.korn.cyg...@googlemail.com wrote: I too am having a hard time envisaging exactly who would be in this class of users who are simultaneously so naive that they don't know about -march or -mcpu or think to read the manual, and yet so advanced that they are trying to write programs for and rebuild modern compilers on ancient equipment. Old equipment is retro in rich places, but we built the first public-access email lab from stuff found in skips in inner-city Sicily and were very glad that Slackware, Debian and so on ran on 386s and 486s. At that time everyone who was anyone had Pentium MMXs at least. The point about defaults is that the GCC default tends to filter down into the default for distributions; if GCC had been following the "90% of people have Pentiums" rule, and the distros followed the default, our low-budget lab would have been two terminals instead of about a dozen (or had to run SCO Xenix or something). I'm an ex-UK-university computing lecturer myself, but have been both rich and poor, both many times, so I know how the other half lives. Incidentally, one of the hackers who used Linux in that Sicilian squat at the age of 13 has just been accepted to do a computing degree at Cambridge University, UK. Not that this *still* has anything to do with the thread, but it's Sunday, so... Bless M
Re: Change x86 default arch for 4.5?
On 2/21/10, Dave Korn dave.korn.cyg...@googlemail.com wrote: On 21/02/2010 20:03, Martin Guy wrote: The point about defaults is that the GCC default tends to filter down into the default for distributions; I'd find it surprising if that was really the way it happens; don't distributions make deliberate and conscious decisions about binary standards and things like that? Changing the default without losing that compatibility would assume that every distro (and there are hundreds of them) either already specifies a specific arch or that its GCC maintainer notices the change in GCC and adds explicit configuration options to revert the change. The big ones with dedicated maintainers for GCC probably already do that; others just configure and make the standard distro and take what comes. On 2/21/10, H.J. Lu hjl.to...@gmail.com wrote: There is nothing which stops them from using -march=i386. It just may not be the default. There is: the arch that the libraries in their distro were compiled to run on. On 2/21/10, Steven Bosscher stevenb@gmail.com wrote: On Sun, Feb 21, 2010 at 9:22 PM, Erik Trulsson ertr1...@student.uu.se wrote: One of the great advantages of much free/open software is the way it will work just fine even on older hardware. And, let's face it, most users of gcc don't use it because it is free software but because it performs just fine for them. And when it does not, they just as easily switch to another compiler. Hardly. At present there is a GCC monoculture, both in what is the standard compiler shipped with most systems and in what compiler packages will build with, either because the build system uses GCC-specific flags or because the code uses GCC extensions. On 2/21/10, Steven Bosscher stevenb@gmail.com wrote: Which brings us back to the discussion of satisfying the needs of a tiny minority while hurting the vast majority of users. There's a difference in quality between the two. 
The "hurt" is that powerful modern PCs might take 20% longer to encode a DVD, while the "need" is that the bulk of software will run at all on their poor hardware. It's usual in modern societies to give priority to enabling the underprivileged to function at all over giving the well-off the maximum of comfort and speed, but how you value the two aspects probably depends on your personal experience of the two realities. On 2/21/10, Dave Korn dave.korn.cyg...@googlemail.com wrote: On 21/02/2010 21:53, Steven Bosscher wrote: Yes, of course -- but what is the advantage of using the latest GCC for such an older processor? Tree-SSA? LTO? Fixed bugs? New languages? Etc? I can see plenty of good reasons for it. Apart from those factors (and one hopes that in general all code generation improves from release to release), users may not really have a choice, being most likely to try (or be given) the most recent stable version of whatever distro, and distros tend to try to ship the most recent stable gcc in each new release. Let me add another example from my own experience: in 2001 I was stuck for months in a crumbling house in the countryside with nothing but an 8MB 25MHz 386, because that's all I had available at the time (green screen, yay!), and I completed what would have been my postgraduate degree project, begun in 1985: an unlimited-precision floating point math library in a pure functional language. The fact that I could do that at all may be due to the low minimum policy of the GCC of the time, both in the distro and on whatever machine David Turner used to compile the binary-only release of the Miranda interpreter. If I recall correctly, the default is currently -march'ed and -mtune'd for the 486, and the instructions the 386 lacks are trapped and emulated in the kernel. On 2/21/10, Steven Bosscher stevenb@gmail.com wrote: Well, as Martin already pointed out (contradicting his own point): Apparently a lot of distributions *do* change the defaults. 
That's OK, I don't have The Truth in my pocket. Nor do I have any quantifiable measure of the number of different systems in use in the whole world, just a value judgement based on a different set of experiences of the outcome of restrictive and generous policies in minimum CPU targeting, which I'm sharing. My direct experience is that low-end PCs are widely used in societies where things are hard, and that upstream software developers are always given the latest, fastest computers to make them more productive and are unaware of the struggling masses :) Cheers M
Re: Are pointers to be supposed to be sign or zero extended to wider integers?
On 2/12/10, Richard Guenther richard.guent...@gmail.com wrote: On Fri, Feb 12, 2010 at 10:41 AM, Jakub Jelinek ja...@redhat.com wrote: It seems pointers are sign extended to wider integers, is that intentional? Your program prints zero-extends for ICC. Probably the behavior is undefined and we get a warning anyway: All C requires is that casting a pointer to an integer and back again should be a no-op, so either behaviour should work, although there's a more detailed explanation here with references to the language standards: http://gcc.gnu.org/onlinedocs/gcc/Arrays-and-pointers-implementation.html To address your specific point, it says it sign-extends "if the pointer representation is smaller than the integer type" M
Re: powerpc-eabi-gcc no implicit FPU usage
On 1/16/10, David Edelsohn dje@gmail.com wrote: Is there a way to get GCC to only use the FPU when we explicitly want to use it (i.e. when we use doubles/floats)? Is -msoft-float my only option here? Is there any sort of #pragma that could do the same thing as -msoft-float (I didn't see one)? To absolutely prevent use of FPRs, one must use -msoft-float. The hard-float and soft-float ABIs are incompatible and one cannot mix object files. There is a third option, -mfloat-abi=softfp, which stipulates that FP instructions can be used within functions but that parameter and return values are passed using the same conventions as soft float. soft- and softfp-compiled files can be linked together, allowing you to mix code that uses FP instructions and code that doesn't with source-file granularity. I dunno if that affects the use of FP registers to load/store 64-bit integer values as you originally described, but it may be the closest you can get without modifying GCC to insert new #pragmas. M
Re: How to implement pattens with more that 30 alternatives
On 12/22/09, Daniel Jacobowitz d...@false.org wrote: in a patch I'm working on for ARM cmpdi patterns, I ended up needing cmpdi_lhs_operand and cmpdi_rhs_operand predicates because Cirrus and VFP targets accept different constants. Automatically generating that would be a bit excessive though. I wouldn't bother implementing that if the VFP/Cirrus conflict is the only thing that needs it. GCC has never been able to generate working code for Cirrus MaverickCrunch, for over a dozen separate reasons, from incorrect use of the way the Maverick sets the condition codes to hardware bugs in the 64-bit instructions (or in the way GCC uses them). I eventually cooked up over a dozen patches to make 4.[23] generate reliable Crunch floating point code, but if you enable the 64-bit insns it still fails the openssl testsuite. M
Re: How to implement pattens with more that 30 alternatives
On 12/22/09, Daniel Jacobowitz d...@false.org wrote: Interesting, I knew you had a lot of Cirrus patches but I didn't realize the state of the checked-in code was so bad. Is what's there useful or actively harmful? Neither useful nor harmful, except in that it adds noise to the arm backend. It's useful if you want to get a working compiler by applying my patches... The basic insn description is ok but the algorithms to use the insns are defective; I suppose it's passively harmful since until it's fixed it just adds noise and size to the arm backend. I did the copyright assignment thing but I haven't mainlined the code, partly because it currently has an embarrassing -mcirrus-di flag to enable the imperfect 64-bit int support, partly out of laziness (it still lacks a dejagnu testsuite for all the insns it can generate and for the more interesting resolved bugs). Maybe one day... M
Re: GCC 4.4.x speed regression - help?
Yes, GCC is bigger and slower, and for several architectures generates bigger, slower code with every release, though saying so won't make you very popular on this list! :) One theory is that there are now so many different optimization passes (and, worse, clever case-specific hacks hidden in the backends) that the interaction between the lot of them is now chaotic. Selecting optimization flags by hand is no longer humanly possible. There is a project to untangle the mess: Grigori Fursin's MILEPOST GCC at http://ctuning.org/wiki/index.php/CTools:MilepostGCC - an AI-based attempt to automatically select combinations of GCC optimization flags according to their measured effectiveness and a profile of your source code's characteristics. The idea is fairly repulsive but effective - it reports major speed gains, of the order of twice as fast compared to the standard fastest -O options, and there is a Google Summer of Code 2009 project based on this work. It seems to me that much over-hacked software lives a life cycle much like the human one: infancy, adolescence, adulthood, middle-age (spot the spread!) and ultimately old age and senility, exhibiting characteristics at each stage akin to the mental faculties of a person. If you're serious about speed, you could try MILEPOST GCC, or try the current up-and-coming adolescent open source compiler, LLVM at llvm.org M
How to make ARM-MaverickCrunch register transfers schedulable?
Hi! I'd appreciate some input on how to get the pipeline scheduler to know about the bizarre MaverickCrunch timing characteristics. Brief: Crunch is an asynchronous ARM coprocessor which has internal operations from/to its own register set, transfers between its own registers and the ARM integer registers, and transfers directly to/from memory. Softfp is the current favourite ABI, where double arguments are passed in ARM register pairs, same as softfloat, and a typical double float function transfers its arguments from ARM registers to the FPU, does some munging between the FPU registers, then transfers the result back to ARM regs for the return(). It has to do this 32 bits at a time:

double adddf(double a, double b) {return (a+b);}

adddf:
	cfmv64lr	mvdx0, r0
	cfmv64hr	mvdx0, r1
	cfmv64lr	mvdx1, r2
	cfmv64hr	mvdx1, r3
	cfaddd		mvdx1, mvdx1, mvdx0
	cfmvr64l	r0, mvdx1
	cfmvr64h	r1, mvdx1
	bx	lr

Although you can do one transfer per cycle between the two units, two consecutive transfers to the same Crunch register incur a delay of four cycles, so here each transfer to a Crunch register takes 4 cycles. A better sequence would be:

	cfmv64lr	mvdx0, r0
	cfmv64lr	mvdx1, r2
	cfmv64hr	mvdx0, r1
	cfmv64hr	mvdx1, r3

My questions are two:
- can I model the fact that two consecutive writes to the same register have a latency of four cycles (whereas writes to different registers can go one per cycle)?
- am I right in thinking to define two new register modes, MAVHI and MAVLO, for the two kinds of writes to the Maverick registers, then turn the movdf (and movdi) definitions for moves to/from ARM registers into define_splits using the two new modes?

Thanks, sorry it's a bit obscure! M An expert is someone who knows more and more about less and less:
Re: Anyone else run ACATS on ARM?
On 8/12/09, Matthias Klose d...@debian.org wrote: On 12.08.2009 23:07, Martin Guy wrote: I looked into gnat-arm for the new Debian port and the conclusion was that it has never been bootstrapped onto ARM. The closest I have seen is Adacore's GNATPro x86-xscale cross-compiler hosted on Windows and targeting Nucleus OS (gak!) is there any arm-linux-gnueabi gnat binary that could be used to bootstrap an initial gnat-4.4 package for debian? No, unless someone has done this since 2007. It involves cross-compiling to generate a native gnat compiler, but that was not a priority for ARM Ltd when I was working on this. M
Re: Anyone else run ACATS on ARM?
On 8/12/09, Joel Sherrill joel.sherr...@oarcorp.com wrote: So any ACATS results from any other ARM target would be appreciated. I looked into gnat-arm for the new Debian port and the conclusion was that it has never been bootstrapped onto ARM. The closest I have seen is Adacore's GNATPro x86-xscale cross-compiler hosted on Windows and targeting Nucleus OS (gak!) The community feeling was that it would just go given a prodigal burst of cross-compiling, but I never achieved sufficiently high blood pressure to try it... M
Re: putc vs. fputc
On 7/24/09, Uros Bizjak ubiz...@gmail.com wrote: The source of gcc uses both, fputc and putc. I would like to do some janitorial work and change fputc to putc. putc and fputc have different semantics: fputc is guaranteed to be a function while putc may be a macro. M He who has nothing to do, combs dogs - Sicilian saying
Re: Machine Description Template?
On 6/5/09, Graham Reitz grahamre...@gmail.com wrote: I have been working through sections 16 and 17 of the gccint.info document and also read through Hans' 'Porting GCC for Dunces'. There is also "Incremental Machine Descriptions for GCC" http://www.cse.iitb.ac.in/~uday/soft-copies/incrementalMD.pdf which describes the creation of a new, clean machine description from scratch. M
Re: help for arm avr bfin cris frv h8300 m68k mcore mmix pdp11 rs6000 sh vax
On 3/14/09, Paolo Bonzini bonz...@gnu.org wrote: Hans-Peter Nilsson wrote: The answer to the question is no, but I'd guess the more useful answer is yes, for different definitions of truncate. Ok, after my patches you will be able to teach GCC about this definition of truncate. I expect it's a bit too extreme an example, but I've just found (to my horror) that the MaverickCrunch FPU truncates all its shift counts to 6-bit signed (-32 (right) to +31 (left)), including on 64-bit integers, which is not very helpful to compile for. ...unless it happens to come easy to handle "shift count is truncated to less than size of word" in your new framework M
Re: help for arm avr bfin cris frv h8300 m68k mcore mmix pdp11 rs6000 sh vax
On 3/16/09, Paolo Bonzini bonz...@gnu.org wrote:

	AND	R1, R0, #31
	MOV	R2, R2, SHIFT R1
	ANDS	R1, R0, #32
	MOVNE	R2, R2, SHIFT #31
	MOVNE	R2, R2, SHIFT #1

or

	ANDS	R1, R0, #32
	MOVNE	R2, R2, SHIFT #-32
	SUB	R1, R1, R0	; R1 = (x >= 32 ? 32 - x : -x)
	MOV	R2, R2, SHIFT R1

Thanks for the tips. Yes, I was contemplating cooking up something like that, hobbled by the fact that if you use Maverick instructions conditionally you either have to put seven nops either side of them or risk death by astonishment. M
Re: GCC 4.4.0 Status Report (2008-11-27)
On 12/9/08, Joel Sherrill [EMAIL PROTECTED] wrote: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37440 Can Ada build on any Arm platform? The only existing GNAT Ada compiler I could find for ARM (while thinking about doing it for the new Debian eabi port) is Adacore's Windows-hosted Nucleus OS cross-compiler for Xscale CPUs, though they don't say what version of GCC they use. Then my funding ran out. My impression at the time, and that of the Debian Ada maintainer, was that it is a case of no one ever having made the effort to cross-bootstrap a native Linux compiler, but that it should just go. If anyone cares enough, I'm open to offers to try it... M
ARM machine description: how are pool_ranges calculated
Hi! I'd appreciate help with my learner's questions about GCC machine descriptions, about the ARM code generator. I'm trying to fix code generation for the Cirrus MaverickCrunch FPU by trying to understand several sets of patches, figure out which are bogus, which are buggy and which need reimplementing, and to distill the essence of them into one proper set, but each day I'm ending up confused by too many things about MDs that I am not certain of, so some help would be appreciated. On with the first question...

ARM machine description and pool ranges: How should the values in the pool_range and neg_pool_range attributes be calculated? What precisely do the values represent?

Here's how far I got: In the machine description, the pool_range and neg_pool_range attributes tell how far ahead of or behind the current instruction a constant can be placed. The most common values are:
- a sign bit and a 12-bit byte offset for ARM load insns (+/- 0 to 4095 bytes, max usable of 4092 for 4-byte-aligned words)
- a sign bit and an 8-bit word offset for Maverick and VFP load insns (+/- 0 to 1020 bytes)
- other ranges for thumb instructions and iwmmxt, depending on insn and addressing mode

When the offsets stored in the instructions are used, they refer to offsets from the address of the instruction (IA) plus 8 bytes. Are the pool_ranges also calculated from IA+8, from the address of the instruction itself, or even from the address of the following instruction (IA+4)? In the md, the most common pairs of values are (-4084, +4096) and (-1008, +1020), but there are several other values in use for no obvious reason: +4092, -1004, -1012, +1024. The +4096 (> 4092) suggests that they are not the values as encoded in the instruction, but are offset by at least 4. The full useful ranges offset by 8 would give (-4084, +4100) and (-1016, +1028). I can't find a mathematically explicit comment about it, and can't make sense of the values. 
In practice, by compiling autogenerated test programs and objdumping them with -d:

- 32-bit integer constants use from [pc, #-4092] to [pc, #4084]
- 64-bit constants in pairs of ARM registers use from [pc, #-3072] to [pc, #3068] (??)
- alternating 32- and 64-bit constants use from [pc, #-3412] to [pc, #3404] (???)
- 64-bit doubles in Maverick registers use from [pc, #-1020] to [pc, #1008] (these are the exact values specified in the attribute fields of cirrus.md for the cfldrd insn, without any IA+8 adjustment!)

Two non-issues:

- the 64-bit alignment requirement for 64-bit quantities in the EABI is not applied to the constant pools - 64-bit data is 32-bit aligned there, so no allowance for a possible extra 4 bytes of alignment padding is necessary.
- the -mcirrus-fix-invalid-insns flag, which peppers the output with NOPs, causes no problems, since the constant pool calculations are done after the NOP insertion.

Hoping I haven't just failed to spot some large and obvious comment... M
Re: What to do with hardware exception (unaligned access) ? ARM920T processor
On 10/1/08, Vladimir Sterjantov [EMAIL PROTECTED] wrote: Processor ARM920T, chip Atmel at91rm9200.

  char c[30];
  unsigned short *pN = (unsigned short *)&c[1];
  *pN = 0x1234;

Accesses to shorts on ARM need to be aligned to an even address, and longs to a 4-byte address. Otherwise the access (e.g. a 4-byte word load through pointer p) returns *(p & ~3) rotated by (p & 3) bytes - a byte rotate, not the bit shift you might expect. Or it causes a memory fault, if that's how your system is configured. If you don't want to make the code portable and you are running a recent Linux, a fast fix is

  echo 2 > /proc/cpu/alignment

which should make the kernel trap misaligned accesses and fix them up for you, with a loss in performance of course. The real answer is to fix the code... M
Re: GCC 4.2.2 arm-linux-gnueabi: c++ exceptions handling?
On 9/26/08, Sergei Poselenov [EMAIL PROTECTED] wrote: Hello all, I've built the above cross-compiler and ran the GCC testsuite. Noted a lot of c++ tests failed with the same output: ... terminate called after throwing an instance of 'int' terminate called recursively Are you configuring cross glibc with --disable-libunwind-exceptions? This has been necessary for all ARM EABI cross-compilers I've built so far. Could someone having the 4.2 release series compiler configured for ARM EABI target try this simple test: I just tried it with the native Debian ARM EABI compiler: gcc-4.2.4, binutils-2.18.0.20080103, glibc-2.7 and it silently exits(0). FWIW, their g++-4.2 is also configured with explicit --disable-sjlj-exceptions, although that seems to be the default. M
Re: Division using FMAC, reciprocal estimates and Newton-Raphson - eg ia64, rs6000, SSE, ARM MaverickCrunch?
On 5/9/08, Paolo Bonzini [EMAIL PROTECTED] wrote: The idea is to use integer arithmetic to compute the right exponent, and the lookup table to estimate the mantissa. I used something like this for square root:

1) shift the entire FP number by 1 to the right (logical right shift)
2) sum 0x2000 so that the exponent is still offset by 64
3) extract the 8 bits from 14 to 22 and look them up in a 256-entry, 32-bit table
4) sum the value (as a 32-bit integer!) with the content of the table
5) perform 2 Newton-Raphson iterations as necessary

It normally turns out to be faster to use the magic integer sqrt algorithm, even when you have multiplication and division in hardware:

  unsigned long isqrt(unsigned long x)
  {
      unsigned long op, res, one;

      op = x;
      res = 0;

      /* "one" starts at the highest power of four <= the argument. */
      one = 1UL << 30;  /* second-to-top bit set */
      while (one > op)
          one >>= 2;

      while (one != 0) {
          if (op >= res + one) {
              op = op - (res + one);
              res = res + 2 * one;
          }
          res >>= 1;
          one >>= 2;
      }
      return res;
  }

The current soft-fp routine in libm seems to use a variant of this, but of course it may be faster if implemented using the Maverick's 64-bit add/sub/cmp. M
Re: Best version of gnat-4.X port to start a port to arm eabi?
Many thanks for the input. On 5/2/08, Joel Sherrill [EMAIL PROTECTED] wrote: Do you mean the gcc target is arm-eabi? As well as the host - I need to end up with a native Ada compiler running on arm-linux-gnueabi. On 5/1/08, Laurent GUERBY [EMAIL PROTECTED] wrote: http://www.rtems.com/wiki/index.php/RTEMSAda Fab! I haven't quite gotten skyeye to the point I trust running testsuites on it completely automated Aah, skyeye! I've been building and testing on qemu-system-arm since 2006 and it's been rock-solid. The main issue for Ada with respect to other GCC languages is the lack of support of multilibs. Fortunately not needed on Debian arm. I don't think you need canadian cross, in the old times there were targets in the Ada Makefile to help moving from a cross to a native compiler. Interesting, I'll have a look. I had been thinking to build a regular cross-compiler and to use it to cross-compile the Ada compiler, but thinking on't, it should be possible to generate it in one canadian (or cross-native) build. Does that sound like a reasonable expectation? I don't think that is necessary. arm-eabi should be very close to working (with newlib as the C library). Good. However the environment is a given: Debian, hence glibc. If you want to compile on a bi-quad Xeon at 3GHz with 16GB of RAM (and many other machines) running debian you can apply for an account on the GCC Compile Farm: Thanks. Incidentally, there's a publicly-accessible 600MHz 512MB ARM box here running arm-linux-gnueabi Debian, on which anyone wanting to do ARM testing/dev is welcome to an account. longer than the bible... Sorry, that was a quote from The Song of Hakawatha, worth googling if you don't know it and fancy a geeky chuckle. M
Best version of gnat-4.X port to start a port to arm eabi?
Hi! I'm about to lower the gangplanks to get a native gnat on ARM EABI through an unholy succession of cross-compilers, with the object of getting gnat-4.1, 4.2 and 4.3 into the new Debian port for ARM EABI. The only arm-targeted gnat I could find is AdaCore's Windows cross-compiler for XScale (gag, retch), but at least that suggests it's possible, and the Debian Ada person made optimistic noises when I asked, but I thought I'd better consult the oracle first :) I've seen the recommendation to use the same version of gnat as the version you're cross-compiling, and I gather that each version will natively compile later versions OK, but maybe not the other way round, so I'm assuming that I need to use an existing x86-native gnat/gcc to make an x86-to-arm cross of the same version, then use that canadianly to make an arm-native one, then use that to build the Debian gnat package of the same and later versions. At the moment I am planning to start with 4.1 to get all 3, but I know that gcj only works on ARM EABI from 4.3, and C++ still has problems with exceptions (try-catch) on EABI, maybe less so in later versions (?) So, before I set out on the journey, does anyone know of gnat reasons or ARM EABI reasons that would make it wiser to start with a later version than 4.1? I confess I know little about Ada except that it has a formal syntax longer than the bible... Thanks M
Re: [linux-cirrus] Re: wot to do with the Maverick Crunch patches?
The company I work for is about to release a board to PCB fab with a Cirrus part on it. If this is the case we may want to hold back on the release and switch ARM parts. If it's the EP93xx, you'd be well-advised to do so; I gather there is at least one similar competitor that doesn't waste silicon on a broken FPU, a display engine that can only do up to 800x600x16 or 1024x768x8 without getting jumpy (2.6.2X fbdev), and a raster graphic operations unit that appears to be slower than doing the corresponding bitops in ARM software. Don't get me wrong, the thing still bristles with peripherals and delivers lots of poke for next to no energy, and we are working on making the most of what we have. Has anyone tried the NetBSD evbarm port on an EP93xx and added the framebuffer driver patch I've seen lurking around? Could its frame buffer do stable higher-res full-colour graphics? The Linux one does them, but the frame jitters about, as if the VDU is being locked out of the RAM for too long. I guess we'll go after our supplier as well to see what availability on the existing parts will be like. Well, leave some for us :) No, it's still a solid chip that runs for hundreds of days without a blip and barely gets warm, so I wouldn't redesign unless you wanted those specific features or are early enough in the design cycle. M
wot to do with the Maverick Crunch patches?
Ok, so we all have dozens of these EP93xx ARM SoCs on cheap boards, with unusable floating point hardware. What do we have to do to get the best-working GCC support for the Maverick Crunch FPU? Suggest: make an open-source project with the objective: to get the best-working GCC support for the Maverick Crunch FPU. Anyone wanna run one, create repositories, set up a mailing list etc a la producingoss.com, or is the current infrastructure sufficient for a coordinated effort? Host the sets of patches under savannah.gnu.org and endeavour to unite them? Do we have a wiki for it, other than Debian's ArmEabiPort and Wikipedia?

As I understand it, mainline GCC with patches in various versions can give:

futaris-4.1.2/-4.2.0: Can usually use floating point in hardware for C and C++, maybe with problems in exception unwinding for C++. In generated asm code, all conditional execution of instructions is disabled except for jump/branch. Loss of code speed/size: negligible. Passes most FP tests but does not produce a fully working glibc (I gather from the Maverick OpenEmbedded people).

cirrus-latest: Conditional instructions are enabled, but you can still get inaccurate or junk results at runtime due to timing bugs in the FP hardware, triggered by certain types of instructions being a certain distance apart at runtime. Does not pass all floating point math verification tests either, and does worse than futaris.

Cirrus also have a hand-coded Maverick libm that you can link with old-ABI binaries - can we incorporate this asm code in mainline? Thoughts on a postcard please... any further progress in OE land? M
Re: [linux-cirrus] Re: wot to do with the Maverick Crunch patches?
On 3/30/08, Brian Austin [EMAIL PROTECTED] wrote: I am now doing Linux ALSA/SoC work for our low power audio codecs. Good luck, look forward to using them... :) I have been given the freedom with this new position to allow access to this machine for outside people to contribute whatever works they would like. I can add a wiki, or whatever ya'll want if you wish to use our hardware and pipeline for WWW things. I also have GIT, BugZilla, and some other stuff. What URL is to be its home page?... M
Re: Benchmarks: 7z, bzip2 gzip.
2008/2/29, J.C. Pizarro [EMAIL PROTECTED]: Here are the results of benchmarks of 3 compressors: 7z, bzip2 and gzip, and GCCs 3.4.6, 4.1.3-20080225, 4.2.4-20080227, 4.3.0-20080228 and 4.4.0-20080222. Thanks, that's very interesting. I had noticed 4.2 producing 10% larger and 10% slower code for a sample code fragment for ARM but couldn't follow it up. Is there a clause in "regressions" for "takes longer to compile and produces worse code"? M
Re: Contributing to cross-compiling
2008/1/31, Manuel López-Ibáñez [EMAIL PROTECTED]: Nonetheless, if someone decided to go through the hassle of collecting tutorials and hints for various cross-compiling configurations in the wiki, I think many users will appreciate it. It is still considered by many to be a dark art[*]. The crosstool project http://kegel.com/crosstool is a humungous shell script with configuration files that has collected a lot of the community wisdom over the years about the necessary runes to build cross-compilers for different scenarios and with different target-cpu/gcc/glibc/OS combinations. There is also a menu-driven spin-off project, crosstool-ng, which is less mature but embodies the same set of knowledge. M