Re: [gentoo-dev] Re: LTO use in the tree
On Sat, Apr 26, 2014 at 10:37 PM, C. Bergström cbergst...@pathscale.com wrote:
> #2 The only reference to anything which the compiler could impact is Use Boyer-Moore (and unroll its inner loop a few times). Finding out which flag controls that for ${CC} would have some importance. It's almost certainly combined with -O3 and or some standalone loop related optimization. (Nothing depending on LTO). If they were really clever or determined - there's probably a few GCC or other pragma which could give a hint about unrolling.

So, I'll certainly agree that package-specific CFLAG tuning will always be superior to just setting some flag at the system level and walking away. And yet, in the same paragraph you mention -O3, which is tantamount to just setting a flag and walking away. That turns on 14 things you probably don't really need.

I run -flto at the system level since in my experience it only causes problems with a handful of packages, and when it does provide a benefit I get it. For the most part it just means my compiles at 2AM take longer, and a bit more RAM, neither of which are a concern. If I do run into a bug, that is just an opportunity to log it and contribute (though to date I haven't been submitting -flto issues as bugs as it is still a bit new).

I think LTO is becoming mainstream-enough that we should consider it supported in the sense that packages should filter it if it is known not to work. We certainly do that with things like -O2/3/s if they don't work. However, it still should be considered a somewhat experimental flag and enabling it will involve bumps. Also, it will always involve a RAM tradeoff, so there may be cases where it isn't filtered because it does work just fine, but it won't work for your system with 4GB of RAM (or 8, or 16 even). If maintainers want to add logic to test before building (as is sometimes done for /var/tmp with very large packages) they are welcome to do so, but I think that is going above-and-beyond.

Rich
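For illustration, the two ideas in the last paragraph (filtering -flto where it is known not to work, and testing available RAM before building) could be sketched in plain shell roughly as follows. This is only a sketch under assumptions: the helper names strip_lto and flags_for_memory and the 4096 MB threshold in the usage comment are made up for this example and are not part of Portage. A real ebuild would more likely rely on filter-flags from flag-o-matic.eclass and on the check-reqs eclass.

```shell
#!/bin/sh
# Hypothetical helper: print the flag string $1 with every -flto / -flto=N removed.
strip_lto() {
    out=""
    for flag in $1; do
        case "$flag" in
            -flto|-flto=*) ;;                   # drop LTO flags
            *) out="${out:+$out }$flag" ;;      # keep everything else
        esac
    done
    printf '%s\n' "$out"
}

# Hypothetical helper: print the flags unchanged if have_mb >= need_mb,
# otherwise print them with LTO stripped.
flags_for_memory() {
    flags=$1 have_mb=$2 need_mb=$3
    if [ "$have_mb" -ge "$need_mb" ]; then
        printf '%s\n' "$flags"
    else
        strip_lto "$flags"
    fi
}

# Example use on Linux (the 4096 MB threshold is an arbitrary assumption):
#   have_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo)
#   CFLAGS=$(flags_for_memory "$CFLAGS" "$have_mb" 4096)
```

The threshold is the weak point of any such check: how much RAM LTO needs depends on the package, so a fixed number can only be a rough guard.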
Re: [gentoo-dev] Re: LTO use in the tree
On 04/27/14 06:23 PM, Rich Freeman wrote:
> On Sat, Apr 26, 2014 at 10:37 PM, C. Bergström cbergst...@pathscale.com wrote:
>> #2 The only reference to anything which the compiler could impact is Use Boyer-Moore (and unroll its inner loop a few times). Finding out which flag controls that for ${CC} would have some importance. It's almost certainly combined with -O3 and or some standalone loop related optimization. (Nothing depending on LTO). If they were really clever or determined - there's probably a few GCC or other pragma which could give a hint about unrolling.
>
> So, I'll certainly agree that package-specific CFLAG tuning will always be superior to just setting some flag at the system level and walking away. And yet, in the same paragraph you mention -O3, which is tantamount to just setting a flag and walking away. That turns on 14 things you probably don't really need.

I was trying to give a simplified example... no need to nitpick my reply (Every compiler defines -O3 differently and even the flag to unroll loops and that threshold may be different.. ...)

> I run -flto at the system level since in my experience it only causes problems with a handful of packages, and when it does provide a benefit I get it.

Can you name a single package that you use which receives a measurable benefit from LTO? (Just asking) I don't disagree about enabling it, filing bug reports or many other things. I'm just curious if you have any hard numbers... (You seem passionate and sorry if this seems like I'm putting you on the spot)

/* Side note IPA (aka whole program and LTO) is by far the hardest optimizations I've ever personally had to debug/engineer/tune in a compiler. Making it robust needs passionate users who file good reduced test cases. While for a single source you have creduce or delta - what options are there for automated reduction of whole program problems.. */
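Since what -O3 turns on differs between compilers and versions, one way to see it for a particular GCC is to compare the optimizer listings at two levels. The following is a rough sketch, not a definitive tool: newly_enabled is a hypothetical helper, and it assumes each line of the listing has the form "flag [enabled]" or "flag [disabled]", as printed by gcc -Q --help=optimizers. Other compilers will need a different approach.

```shell
#!/bin/sh
# Hypothetical helper: given two saved optimizer listings, print the flags that
# are marked [enabled] in the second listing ($2) but not in the first ($1).
newly_enabled() {
    old_sorted=$(mktemp) && new_sorted=$(mktemp) || return 1
    # Keep only the flag name from lines whose second field is "[enabled]".
    awk '$2 == "[enabled]" { print $1 }' "$1" | sort > "$old_sorted"
    awk '$2 == "[enabled]" { print $1 }' "$2" | sort > "$new_sorted"
    # comm -13 prints the lines that appear only in the second (sorted) file.
    comm -13 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}

# Example use with GCC (file names are arbitrary):
#   gcc -Q -O2 --help=optimizers > o2.txt
#   gcc -Q -O3 --help=optimizers > o3.txt
#   newly_enabled o2.txt o3.txt
```

Flags that take a value rather than an on/off state are ignored by this sketch, so the result is a lower bound on what changes between the two levels.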
Re: [gentoo-dev] Re: LTO use in the tree
On Sun, Apr 27, 2014 at 7:41 AM, C. Bergström cbergst...@pathscale.com wrote:
> On 04/27/14 06:23 PM, Rich Freeman wrote:
>> And yet, in the same paragraph you mention -O3, which is tantamount to just setting a flag and walking away. That turns on 14 things you probably don't really need.
>
> I was trying to give a simplified example... no need to nitpick my reply (Every compiler defines -O3 differently and even the flag to unroll loops and that threshold may be different.. ...)

Sorry if it came across aggressively. I was just pointing out that the reason one sets CFLAGs generically is to avoid the trouble of optimizing the optimizer. This always comes at a cost - I tend to use -Os, but no doubt some packages would benefit from a different global optimization, let alone specific optimizations. That was just the point I wanted to make about LTO - I think it is of general usefulness since it has the potential to help, and rarely hurts. The only problem with it is that the implementation is immature.

> Can you name a single package that you use which receives a measurable benefit from LTO? (Just asking)

Alas, I cannot. There are some general benchmarks out there, and they seem to vary from little to no effect to significant. More CPU-intensive software seems the most likely to benefit. No doubt the benefits of LTO will improve as it matures.

Rich
Re: [gentoo-dev] Re: LTO use in the tree
On 04/27/2014 07:23, Rich Freeman wrote:
> On Sat, Apr 26, 2014 at 10:37 PM, C. Bergström cbergst...@pathscale.com wrote:
>> #2 The only reference to anything which the compiler could impact is Use Boyer-Moore (and unroll its inner loop a few times). Finding out which flag controls that for ${CC} would have some importance. It's almost certainly combined with -O3 and or some standalone loop related optimization. (Nothing depending on LTO). If they were really clever or determined - there's probably a few GCC or other pragma which could give a hint about unrolling.
>
> So, I'll certainly agree that package-specific CFLAG tuning will always be superior to just setting some flag at the system level and walking away. And yet, in the same paragraph you mention -O3, which is tantamount to just setting a flag and walking away. That turns on 14 things you probably don't really need.
>
> I run -flto at the system level since in my experience it only causes problems with a handful of packages, and when it does provide a benefit I get it. For the most part it just means my compiles at 2AM take longer, and a bit more RAM, neither of which are a concern. If I do run into a bug, that is just an opportunity to log it and contribute (though to date I haven't been submitting -flto issues as bugs as it is still a bit new).

My curiosity, as I have not attempted LTO yet on any machine, is what are the RAM requirements? Is it a hard limit, wherein the compiler simply fails if there isn't enough RAM, or does it just start hitting swap real hard?

Those of us using older archs where the RAM is limited might have to be more cautious w/ LTO. I.e., my SGI O2 maxes right now at 512MB. It can go to 1GB if the odd memory/PROM issue is ever worked out. But 512MB is it for now, so what are my odds of successfully using LTO on that?

Especially if LTO helps to reduce the final binary size, that's less data being shuffled around main memory and the CPU caches, which, although it means slower compile times, might make such a machine a bit snappier. Though, I dread how long GCC will take to build itself w/ LTO. The O2 already needs ~18hrs for 4.8. I haven't tried 4.9 on it yet.

--
Joshua Kinard
Gentoo/MIPS
ku...@gentoo.org
4096R/D25D95E3 2011-03-28

The past tempts us, the present confuses us, the future frightens us. And our lives slip away, moment by moment, lost in that vast, terrible in-between.
--Emperor Turhan, Centauri Republic
[gentoo-dev] Last rites: dev-python/python-gnutls
# Manuel Rüger mr...@gentoo.org (28 Apr 2014)
# Fails to build with gnutls-3, on behalf of python herd
# See bug #446016
dev-python/python-gnutls
Re: [gentoo-dev] Re: LTO use in the tree
On 04/26/2014 20:34, C. Bergström wrote:
> On 04/27/14 02:58 AM, Martin Vaeth wrote:
>> Rich Freeman ri...@gentoo.org wrote:
>>> FWIW the list of packages I have issues with include:
>>
>> Not sure whether this is the right place to post it.
>
> It's interesting to see that rather lengthy list. From a compiler engineer perspective I'd like to toss in my opinion [snip]

What compiler, out of curiosity?

--
Joshua Kinard
Gentoo/MIPS
ku...@gentoo.org
Re: [gentoo-dev] Re: LTO use in the tree
On Sun, Apr 27, 2014 at 6:56 PM, Joshua Kinard ku...@gentoo.org wrote:
> My curiosity, as I have not attempted LTO yet on any machine, is what are the RAM requirements? Is it a hard limit, wherein the compiler simply fails if there isn't enough RAM, or does it just start hitting swap real hard?

It just allocates RAM, and the OS does the rest. I've seen it invoke the OOM killer. That was back when I only had 8GB of RAM. Now I have 16GB and I only need to disable LTO on the really big packages. Of course, if you set an appropriate ulimit then the process will just terminate more gracefully. I'd highly recommend doing just that if you have a lot of swap available.

> Those of us using older archs where the RAM is limited might have to be more cautious w/ LTO. I.e., my SGI O2 maxes right now at 512MB. It can go to 1GB if the odd memory/PROM issue is ever worked out. But 512MB is it for now, so what are my odds of successfully using LTO on that?

About zero. Well, I'm sure it will work fine for hello.c, especially if you eliminate any function calls inside of it.

> Especially if LTO helps to reduce the final binary size, that's less data being shuffled around main memory and the CPU caches, which, although it means slower compile times, might make such a machine a bit snappier. Though, I dread how long GCC will take to build itself w/ LTO. The O2 already needs ~18hrs for 4.8. I haven't tried 4.9 on it yet.

Yeah, good luck with that... :) I'd be curious as to what you find. You can always try it out by picking a small package and doing a CFLAGS=foo emerge bar. Be sure to only use -j1 -flto=1 as well.

Rich
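The ulimit suggestion above can be sketched as a small wrapper that caps virtual memory for one command only, so a runaway LTO link fails with an allocation error instead of pushing the machine into swap or waking the OOM killer. This is a sketch under assumptions: run_limited is a hypothetical helper name, the 4 GiB cap in the usage comment is arbitrary, and ulimit -v is an extension provided by bash and dash rather than something every POSIX shell guarantees.

```shell
#!/bin/sh
# Hypothetical helper: run a command with its virtual memory capped at $1 KiB.
# The limit is set inside a subshell, so the calling shell is not affected.
run_limited() {
    limit_kb=$1
    shift
    (
        ulimit -v "$limit_kb" || exit 125   # bail out if the limit cannot be set
        "$@"
    )
}

# Example use for a one-off LTO trial as described above; "bar" is the
# placeholder package name from the message, and 4194304 KiB (4 GiB) is an
# arbitrary cap:
#   run_limited 4194304 env CFLAGS="-O2 -flto=1" MAKEOPTS="-j1" emerge bar
```

Capping virtual memory rather than resident memory is a blunt instrument, since a process can map far more than it touches, so the limit usually needs some headroom above the RAM actually available.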
Re: [gentoo-dev] Re: LTO use in the tree
On 04/27/2014 19:08, Rich Freeman wrote:
> On Sun, Apr 27, 2014 at 6:56 PM, Joshua Kinard ku...@gentoo.org wrote:
>> My curiosity, as I have not attempted LTO yet on any machine, is what are the RAM requirements? Is it a hard limit, wherein the compiler simply fails if there isn't enough RAM, or does it just start hitting swap real hard?
>
> It just allocates RAM, and the OS does the rest. I've seen it invoke the OOM killer. That was back when I only had 8GB of RAM. Now I have 16GB and I only need to disable LTO on the really big packages. Of course, if you set an appropriate ulimit then the process will just terminate more gracefully. I'd highly recommend doing just that if you have a lot of swap available.

My favourite, starting long compiles on slow boxen, only to wake up to discover they failed in the final five minutes of the build over something as trite as low memory :)

>> Those of us using older archs where the RAM is limited might have to be more cautious w/ LTO. I.e., my SGI O2 maxes right now at 512MB. It can go to 1GB if the odd memory/PROM issue is ever worked out. But 512MB is it for now, so what are my odds of successfully using LTO on that?
>
> About zero. Well, I'm sure it will work fine for hello.c, especially if you eliminate any function calls inside of it.

About zero? So, some floating point value infinitely between 0 and 1? Hmm, maybe I'll try it once I get my SGI Octane to boot Linux again.

>> Especially if LTO helps to reduce the final binary size, that's less data being shuffled around main memory and the CPU caches, which, although it means slower compile times, might make such a machine a bit snappier. Though, I dread how long GCC will take to build itself w/ LTO. The O2 already needs ~18hrs for 4.8. I haven't tried 4.9 on it yet.
>
> Yeah, good luck with that... :) I'd be curious as to what you find. You can always try it out by picking a small package and doing a CFLAGS=foo emerge bar. Be sure to only use -j1 -flto=1 as well.

O2 only has one CPU, so it's always -j1. SMP on my other MIPS machines doesn't work yet (either Linux isn't supported, or I haven't debugged SMP code yet).

--
Joshua Kinard
Gentoo/MIPS
ku...@gentoo.org
Re: [gentoo-dev] Re: LTO use in the tree
On 04/27/2014 20:40, C. Bergström wrote:
> On those old SGI MIPS machines use MIPSPro. It had better (LTO/whole program) optimizations than GCC more than 10 years ago (imho and gcc may have caught up now in 4.9). Just add the -ipa flag and test. In fairness there is primarily 3 limitations with MIPSPro IPA [snip]

That's if they ran IRIX. They run Linux :)

--
Joshua Kinard
Gentoo/MIPS
ku...@gentoo.org