Re: H1 2015 Priorities and Bare-Metal Programming
On 3 February 2015 at 11:15, Iain Buclaw ibuc...@gdcproject.org wrote: On 3 February 2015 at 08:28, Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote: On 2/2/2015 8:36 PM, Daniel Murphy wrote: If so, what corrective action is the user faced with: The user can modify the code to allow it to be inlined. There are a huge number of constructs that cause dmd's inliner to completely give up. If a function _must_ be inlined, the compiler needs to give an error if it fails. I'd like to reexamine those assumptions, and do a little rewinding. The compiler offers a -inline switch, which will inline everything it can. Performance oriented code will use that switch. So why doesn't the compiler inline everything anyway? Because there's a downside - it can make code difficult to symbolically debug, and it makes for difficulties in getting good profile data. Manu was having a problem, though. He wanted inlining turned off globally so he could debug his code, but have it left on for a few functions where not inlining them would make the debug version too slow. pragma(inline,true) tells the compiler that this function is 'hot', and pragma(inline, false) that this function is 'cold'. Knowing the hot and cold paths enables the optimizer to do a better job. There are literally thousands of optimizations applied. Plucking exactly one out and elevating it to a do-or-die status, ignoring the other 999, is a false god. There's far more to a programmer reorganizing his code to make it run faster than just sprinkling it with forceinline pixie dust. There is a lot of value to telling the compiler where the hot and cold parts are, because those cannot be statically determined. But exactly how to achieve that goal really should be left up to the compiler implementer. Doing a better or worse job of that is a quality of implementation issue, not a language specification issue. Perhaps the fault here is calling it pragma(inline,true). 
Perhaps if it was pragma(hot) and pragma(cold) instead?
pragma(hot/cold) or @attribute(hot/cold): this maps well in gdc's framework too. There is also 'flatten', which allows you to control inlining at the caller rather than at the callee.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/3/15 9:05 PM, Daniel Murphy wrote: Andrei Alexandrescu wrote in message news:mar39k$hvh$1...@digitalmars.com... I think the best route here - and the most in-the-spirit-of-D - is to provide introspection on whether a function is being inlined or not. Then we can always have in libraries:
bool uart(ubyte b) {
    static assert(__traits(inlined), "Inlining of uart() must be supported.");
    ...
}
That would require that inlining is done in the frontend, which is not acceptable.
Yah, won't fly. Sorry for the distraction. -- Andrei
Re: H1 2015 Priorities and Bare-Metal Programming
Andrei Alexandrescu wrote in message news:mar39k$hvh$1...@digitalmars.com... I think the best route here - and the most in-the-spirit-of-D - is to provide introspection on whether a function is being inlined or not. Then we can always have in libraries:
bool uart(ubyte b) {
    static assert(__traits(inlined), "Inlining of uart() must be supported.");
    ...
}
That would require that inlining is done in the frontend, which is not acceptable.
Re: H1 2015 Priorities and Bare-Metal Programming
On Tuesday, 3 February 2015 at 22:41:30 UTC, Jonathan M Davis wrote: Well, as far as I can tell, that's pretty much exactly what Walter meant by hot and cold - how likely they are to be called, with the idea that the compiler could then use that information to better optimize - be it inlining or some other optimization. But hot does not imply inlining. You might want to tell the compiler that a function should use hot function call mechanics and never inline (for pragmatic reasons, like injecting breakpoints).
Re: H1 2015 Priorities and Bare-Metal Programming
On Tue, Feb 03, 2015 at 09:53:37PM +1100, Daniel Murphy via Digitalmars-d wrote: Walter Bright wrote in message news:maq7f1$2hka$1...@digitalmars.com... [...] It's like the old joke where a captain is asked by a colonel how he'd get a flagpole raised. The captain replied with a detailed set of instructions. The colonel said wrong answer, the correct response would be for the captain to say: Sergeant, get that flag pole raised! [...] We have inline assembler because sometimes being explicit is what's needed. I would consider using forceinline in the same situations where inline assembly is a viable option. eg interfacing with hardware, computation kernels Computation colonels? :-D T -- Having a smoking section in a restaurant is like having a peeing section in a swimming pool. -- Edward Burr
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/3/15 12:17 AM, Johannes Pfau wrote: Well I see that you're not even considering adding a simple pragma to help embedded programming. In that case I see absolutely no reason to continue working on that. You guys say we lack expertise so we cannot help directly and you're in search of champions for these areas. But whenever somebody working with D on embedded systems actually comes up with an issue related to embedded programming and propose solutions you simply dismiss it. Often even based on vague statements like that's not a common task. http://wiki.dlang.org/Vision/2015H1 I think we need to work on better inlining. Which format (pragma vs. attribute etc) is just tactical detail. Clearly there needs to be best effort and won't compile unless it inlines directives. Johannes, please let us know whether this is everything needed to float your boat. I'm unclear whether you believe volatile data is needed or not. If it's not, we're good; if it is, you need to redo your argument because it was poorly conducted. pragma(address) could be trivially implemented now and I still think it's a logical extension of the language, whereas global property ref functions for this purpose are just hacks. Till D will have full inline control rust will probably already have all the market share in these areas. At least I'm not willing to invest any more effort into this. No need to get agitated over this. We're all on the same boat. Rust also uses intrinsics for volatile loads and stores: http://doc.rust-lang.org/core/intrinsics/. It does have a way to force inlining recommended to use with caution: https://mail.mozilla.org/pipermail/rust-dev/2013-May/004272.html Andrei
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/2015 8:36 PM, Daniel Murphy wrote: If so, what corrective action is the user faced with: The user can modify the code to allow it to be inlined. There are a huge number of constructs that cause dmd's inliner to completely give up. If a function _must_ be inlined, the compiler needs to give an error if it fails. I'd like to reexamine those assumptions, and do a little rewinding. The compiler offers a -inline switch, which will inline everything it can. Performance oriented code will use that switch. So why doesn't the compiler inline everything anyway? Because there's a downside - it can make code difficult to symbolically debug, and it makes for difficulties in getting good profile data. Manu was having a problem, though. He wanted inlining turned off globally so he could debug his code, but have it left on for a few functions where not inlining them would make the debug version too slow. pragma(inline,true) tells the compiler that this function is 'hot', and pragma(inline, false) that this function is 'cold'. Knowing the hot and cold paths enables the optimizer to do a better job. There are literally thousands of optimizations applied. Plucking exactly one out and elevating it to a do-or-die status, ignoring the other 999, is a false god. There's far more to a programmer reorganizing his code to make it run faster than just sprinkling it with forceinline pixie dust. There is a lot of value to telling the compiler where the hot and cold parts are, because those cannot be statically determined. But exactly how to achieve that goal really should be left up to the compiler implementer. Doing a better or worse job of that is a quality of implementation issue, not a language specification issue. Perhaps the fault here is calling it pragma(inline,true). Perhaps if it was pragma(hot) and pragma(cold) instead?
Re: H1 2015 Priorities and Bare-Metal Programming
On Mon, 02 Feb 2015 21:53:43 +, Dicebot via Digitalmars-d digitalmars-d@puremagic.com wrote: On Monday, 2 February 2015 at 21:19:05 UTC, Walter Bright wrote: On 2/2/2015 9:17 AM, H. S. Teoh via Digitalmars-d wrote: Walter seems to dislike forced inlining for various reasons, preferring inlining as a hint at the most, and he probably has a point in most cases (let the compiler make the judgment). But in other cases, such as the one in question, the user needs to override the compiler's decision. Currently there's no way to do that, and it's a showstopper for those users. This is a settled issue. After all, I wrote: http://wiki.dlang.org/DIP56 Erm. Quoting the DIP: If a pragma specifies always inline, whether or not the target function(s) are actually inlined is implementation defined, although the implementation will be expected to inline it if practical. This is exactly the absolutely unacceptable part that makes your DIP useless and last discussion has stalled (from my POV) exactly at the point where you refused to negotiate any compromises on that matter.
Ok, why not add some warn level?
pragma(inline, true, WARN_LEVEL); // always inline
WARN_LEVEL = 0 // no warning or error is printed
WARN_LEVEL = 1 // warning
WARN_LEVEL = 2 // error
It could then easily be controlled by a version condition.
Re: H1 2015 Priorities and Bare-Metal Programming
On Mon, 02 Feb 2015 15:44:21 -0800, Walter Bright newshou...@digitalmars.com wrote: Also it's a conceptually nice way for typed registers: You can read it as: I've got a Register of type PORT which is an extern variable located at a fixed address. PORT abstracts away volatile access. auto a = PORT!0x1234; looks nicer than: pragma(address, 0x1234) PORT a; But because of the "every object needs an address" rule it wastes at least one byte in the data segment. And your example is actually in TLS. Well I see that you're not even considering adding a simple pragma to help embedded programming. In that case I see absolutely no reason to continue working on that. You guys say we lack expertise so we cannot help directly and you're in search of champions for these areas. But whenever somebody working with D on embedded systems actually comes up with an issue related to embedded programming and propose solutions you simply dismiss it. Often even based on vague statements like "that's not a common task". http://wiki.dlang.org/Vision/2015H1 pragma(address) could be trivially implemented now and I still think it's a logical extension of the language, whereas global property ref functions for this purpose are just hacks. Till D will have full inline control rust will probably already have all the market share in these areas. At least I'm not willing to invest any more effort into this.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/2015 8:36 PM, Daniel Murphy wrote: The user can modify the code to allow it to be inlined. There are a huge number of constructs that cause dmd's inliner to completely give up. If a function _must_ be inlined, the compiler needs to give an error if it fails. A separate message with a pragmatic difficulty with your suggestion. Different compilers will have different inlining capabilities. Different versions of the same compiler may behave differently. This means that sometimes a user may get a compilation failure, sometimes not. It's highly brittle. So enter the workaround code. Different compilers and different versions will require different workaround code. Is this really reasonable for users to put up with? And will they really want to be running the workaround code when they upgrade the compiler and now it could have inlined it?
Re: H1 2015 Priorities and Bare-Metal Programming
On 3 February 2015 at 08:28, Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote: On 2/2/2015 8:36 PM, Daniel Murphy wrote: If so, what corrective action is the user faced with: The user can modify the code to allow it to be inlined. There are a huge number of constructs that cause dmd's inliner to completely give up. If a function _must_ be inlined, the compiler needs to give an error if it fails. I'd like to reexamine those assumptions, and do a little rewinding. The compiler offers a -inline switch, which will inline everything it can. Performance oriented code will use that switch. So why doesn't the compiler inline everything anyway? Because there's a downside - it can make code difficult to symbolically debug, and it makes for difficulties in getting good profile data. Manu was having a problem, though. He wanted inlining turned off globally so he could debug his code, but have it left on for a few functions where not inlining them would make the debug version too slow. pragma(inline,true) tells the compiler that this function is 'hot', and pragma(inline, false) that this function is 'cold'. Knowing the hot and cold paths enables the optimizer to do a better job. There are literally thousands of optimizations applied. Plucking exactly one out and elevating it to a do-or-die status, ignoring the other 999, is a false god. There's far more to a programmer reorganizing his code to make it run faster than just sprinkling it with forceinline pixie dust. There is a lot of value to telling the compiler where the hot and cold parts are, because those cannot be statically determined. But exactly how to achieve that goal really should be left up to the compiler implementer. Doing a better or worse job of that is a quality of implementation issue, not a language specification issue. Perhaps the fault here is calling it pragma(inline,true). Perhaps if it was pragma(hot) and pragma(cold) instead? 
pragma(hot/cold) or @attribute(hot/cold): this maps well in gdc's framework too.
Re: H1 2015 Priorities and Bare-Metal Programming
On Tuesday, 3 February 2015 at 09:36:57 UTC, Walter Bright wrote: On 2/3/2015 1:11 AM, Mike wrote: All things being equal, will there be any difference between the resulting binaries for each of these scenarios? No. Another way of putting it: Does pragma(inline, true) simply allow the user to compile parts of their source file with -inline? Yes. I'm beginning to think a pragma(hot, true/false) might be a better approach, as there are more optimizations that can be done better if the compiler knows which branches are hot or not. I believe both sides in this debate are actually right, but I'm siding with Walter: pragma(inline, true) should not generate a compiler error if a function cannot be inlined. The expressed need by the other side is right on, and that need should have been acknowledged. However, I believe that is a need that Walter did not intend to address with DIP56. IMO, the important thing to explain in DIP56 is the relationship pragma(inline) has to the -inline compiler flag, and that it is not a substitute for future features that may provide strict enforcement. It may even be better, and less controversial, to have a pragma(compiler, -inline -whatever) that gives the user finer control over not just -inline but other compiler options as well... and each compiler flag should have a negative (e.g. -no-inline) counterpart. I don't like hot/cold as it does not convey the effect. Mike
Re: H1 2015 Priorities and Bare-Metal Programming
On Tuesday, 3 February 2015 at 11:33:35 UTC, Mike wrote: I don't like hot/cold as it does not convey the effect. Yeah, I believe LLVM has a register saving calling convention that is cold_cc, so cold would be more suited for functions that are almost never called.
Re: H1 2015 Priorities and Bare-Metal Programming
On Tue, 03 Feb 2015 07:09:10 -0800, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: On 2/3/15 12:17 AM, Johannes Pfau wrote: Well I see that you're not even considering adding a simple pragma to help embedded programming. In that case I see absolutely no reason to continue working on that. You guys say we lack expertise so we cannot help directly and you're in search of champions for these areas. But whenever somebody working with D on embedded systems actually comes up with an issue related to embedded programming and propose solutions you simply dismiss it. Often even based on vague statements like that's not a common task. http://wiki.dlang.org/Vision/2015H1 I think we need to work on better inlining. Which format (pragma vs. attribute etc) is just tactical detail. Clearly there needs to be best effort and won't compile unless it inlines directives. I wasn't part of that discussion and I don't want to be part of it. In the end I don't care how exactly force inline is implemented, as long as it is implemented. Johannes, please let us know whether this is everything needed to float your boat. I'm unclear whether you believe volatile data is needed or not. If it's not, we're good; if it is, you need to redo your argument because it was poorly conducted. I was actually not arguing for any kind of 'volatile data' or replacing volatileLoad/Store (in this discussion). pragma(address) could be trivially implemented now and I still think it's a logical extension of the language, whereas global property ref functions for this purpose are just hacks. Till D will have full inline control rust will probably already have all the market share in these areas. At least I'm not willing to invest any more effort into this. No need to get agitated over this. We're all on the same boat. Rust also uses intrinsics for volatile loads and stores: http://doc.rust-lang.org/core/intrinsics/. 
It does have a way to force inlining recommended to use with caution: https://mail.mozilla.org/pipermail/rust-dev/2013-May/004272.html Andrei
That's a misunderstanding. I don't want to replace volatileLoad/Store intrinsics or any other kind of volatile access. pragma(address) is something completely different. I posted a full example here: https://forum.dlang.org/post/maotpd$1ape$1...@digitalmars.com
Basically it adds this feature:
extern __gshared int x; // extern variable, default mangled name
pragma(mangle, "noop") extern __gshared int y; // specify name
pragma(address, 0x05) extern __gshared int z; // specify address
It doesn't make z volatile or add any other magic. It simply declares that z is a variable at a _fixed absolute_ location (compile time constant). I even posted a link to a full working implementation, 80 LOC. It's very useful on embedded systems where you have data at fixed locations. Why not access this data like any other extern data using variables? It does allow some nice patterns when _combined_ with volatileLoad/Store but this seems to only confuse people. Here's a reduced example for that: http://pastebin.com/RGhKdm9i
__builtin_volatile_load = volatileLoad
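For readers following along, the "wrapped volatileLoad/Store" pattern that this sub-thread keeps returning to might look roughly like the sketch below. It is not Johannes's pastebin code: the Register struct and its read/write members are hypothetical names, the import assumes a recent druntime where the intrinsics live in core.volatile (at the time of the thread they were in core.bitop), and a local variable stands in for the hardware register so the example runs on a desktop.

```d
import core.volatile : volatileLoad, volatileStore;

/// Hypothetical wrapper (not from the thread): a typed view of one register.
struct Register(T)
{
    T* address; // on real hardware this would be a fixed MMIO address

    // Ask the compiler to expand the wrapper at the call site.
    pragma(inline, true)
    T read() { return volatileLoad(address); }

    pragma(inline, true)
    void write(T value) { volatileStore(address, value); }
}

void main()
{
    uint fakePort = 0;                    // stand-in for a hardware register
    auto port = Register!uint(&fakePort);
    port.write(0x55);
    assert(port.read() == 0x55);
}
```

The wrapper only has a bearable cost if read and write really are expanded inline, which is why the inlining guarantee and the MMIO discussion are tied together in this thread.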
Re: H1 2015 Priorities and Bare-Metal Programming
On Monday, 2 February 2015 at 23:29:26 UTC, Walter Bright wrote: Now, when it can't inline, do you expect the compiler to produce an error message? Just to add support: Yes, exactly that. If so, what corrective action is the user faced with: Change the function so it can be inlined, or remove the pragma. Yes, there are many other optimisations that matter, but I think inline is special and so do a bunch of other users. Just one example: You write a simple expression that does operation A, in a tight loop. You look at the assembly and see that it is for some reason terribly unnecessarily slow, especially in debug builds. To get the sort of asm you want, you implement a rather nasty, messy little function to do the operation, which internally uses intrinsics and/or lots of casts etc. that you don't want lying around in your normal code. You now have a choice: copy and paste the contents everywhere you need it, or pay for the function call in debug builds (and perhaps optimised too, if the compiler decides it doesn't feel like inlining it). With an actual guarantee of inlining, the problem goes away. Strap a pragma(always_inline) on the function and carry on. Hints are great, but sometimes commands are necessary. Guaranteed inlining makes it possible to have fast debug builds, because it lets you abstract away ugly hand-tuned code at guaranteed zero cost.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/15 5:34 PM, Jonathan M Davis via Digitalmars-d wrote: On Monday, February 02, 2015 13:01:28 Walter Bright via Digitalmars-d wrote: On 2/2/2015 6:43 AM, Manu via Digitalmars-d wrote: I'm pretty sure the only controversy is that you want it to be a pragma, everyone else wants it to be an attribute. That's correct. My reasoning is simple - an attribute defines the semantics of the interface, a pragma gives instructions to the compiler, and does not affect logical semantics. For example, attributes change the name mangling, because it affects the semantic interface. A pragma would not. That makes sense, though the one issue that I see with making it a pragma is the fact that pragmas are supposed to be compiler-specific and not part of the language (at least as I understand it) From http://dlang.org/pragma.html#predefined-pragmas: All implementations must support these, even if by just ignoring them: I would believe that inline would definitely be one that could be ignored, seeing as the code generated whether inlining or not creates the same end result. But you could put it under that list, and then most compilers will support it. Remember, all 3 major compilers here are based on the same front-end code. -Steve
Re: H1 2015 Priorities and Bare-Metal Programming
On Tuesday, 3 February 2015 at 10:36:08 UTC, Walter Bright wrote: Obviously, inlining functions with loops tend to have lower payoffs anyway, because the loop time swamps the function call overhead. I feel a bit awkward disagreeing with you about a topic like this, because of your obviously huge amount of expertise, but this seems just totally wrong in a few situations: Combining loops. Loops where the length is a compile-time constant in the calling context. Loops where a conditional in the loop is a compile-time constant in the calling context.
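A small sketch of the second case listed above, a loop whose length is a compile-time constant only in the calling context. The helper sumFirst is a hypothetical name, and whether any given compiler actually inlines and unrolls it is an assumption, not something the thread establishes.

```d
/// Hypothetical helper: sums the first n elements of data.
int sumFirst(const(int)[] data, size_t n)
{
    int total = 0;
    foreach (i; 0 .. n)
        total += data[i];
    return total;
}

void main()
{
    int[4] samples = [1, 2, 3, 4];
    // Inside sumFirst the trip count n is a runtime value. At this call
    // site it is the literal 4, which an optimizer can only exploit
    // (e.g. by unrolling) once the call has been inlined.
    assert(sumFirst(samples[], 4) == 10);
}
```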
Re: H1 2015 Priorities and Bare-Metal Programming
On Tuesday, 3 February 2015 at 18:16:20 UTC, Andrei Alexandrescu wrote: static assert(__traits(inlined), "Inlining of uart() must be supported.");
This is unworkable:
1. You want to be able to turn off inlining for debugging.
2. You would have to delay evaluating the static assert until after optimization, since the optimizer could decide to inline it at a late stage.
3. You should be able to take the address of the function.
Walter is right when saying that for D inlining is an optimization and should not be part of the semantics (but you could have an enforced hint). Querying optimization effects at compile time creates dependencies that may lead to computations that don't resolve.
Re: H1 2015 Priorities and Bare-Metal Programming
On Tuesday, 3 February 2015 at 08:31:24 UTC, Walter Bright wrote: On 2/2/2015 8:36 PM, Daniel Murphy wrote: The user can modify the code to allow it to be inlined. There are a huge number of constructs that cause dmd's inliner to completely give up. If a function _must_ be inlined, the compiler needs to give an error if it fails. A separate message with a pragmatic difficulty with your suggestion. Different compilers will have different inlining capabilities. Different versions of the same compiler may behave differently. This means that sometimes a user may get a compilation failure, sometimes not. It's highly brittle. This is _exactly_ why error message is needed. Considering compiler differences and with all inlining bugs in mind it is impossible for developer to reason if certain code will be inlined and rely on it in any fashion. For most programs it is mere inconvenience. For something low-level like embedded programming it can become a deal-breaker making the difference between working and broken program. So enter the workaround code. Different compilers and different versions will require different workaround code. Is this really reasonable for users to put up with? Yes, and this is very good as it will ensure that feature won't be abused in normal programs because it is so hard to deal with. But in barebone world it is very common to develop exclusively for one specific compiler version so it won't be a problem. And will they really want to be running the workaround code when they upgrade the compiler and now it could have inlined it? No, they will really want to not use this feature. Because it has different niche. To sum it up, I don't think your proposal is bad on its own - it simply tries to solve different problems than ones being asked. Manu problem is a different problem than Johannes has - but you seem to consider those identical. Why can't we simply have 3 cases for pragma? 
pragma(inline, never); // not even with -inline
pragma(inline, always); // even without -inline
pragma(inline, force); // error if can't inline
Re: H1 2015 Priorities and Bare-Metal Programming
On 2015-02-03 05:05, Daniel Murphy wrote: Walter Bright wrote in message news:maq8ao$2idu$1...@digitalmars.com... Yup. I understand the concern that a compiler would opt out of inlining those if it legally could, but I just cannot see that happening in reality. Modern compilers have been inlining for 25 years now, and they're not likely to just stop doing it. No, the problem is that the code might accidentally contain a construct that is not inlineable. The user will expect it to be inlined, but the compiler will silently fail. eg
void myWrapperFunc() {
    callSomeFunc(999, 123, something);
}
This function will not be inlined if callSomeFunc has a default argument that calls alloca, for example. If a hidden failure becomes a compiler error, the user can trivially correct the problem.
+1 i am a simple user writing mostly programs to crunch my scientific data. i'd like to move my C code to D. but i deal with GBs of data that i have to sieve through many times. in C, i have lots of repeating integer stunts in the inner loop that must be inlined. i used macros for this. if i cannot mark small helper function in the inner loop so that they are guaranteed to be inlined, i am screwed. i would have to copy and paste lots of code, making the result worse than the C code. i am fine with a compilation error when force_inlining fails, if it comes with a brief explanation of why so i get an idea of how to fix it and make it work. i am not writing or reading assembler and i don't plan to. but i also don't want to be in doubt about the inlining. i just want to get stuff done in a convenient way. if D offers no way other than copypasting code blocks i cannot use it for my work. sadly. /det
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/3/15 9:29 AM, Dicebot wrote: On Tuesday, 3 February 2015 at 08:31:24 UTC, Walter Bright wrote: On 2/2/2015 8:36 PM, Daniel Murphy wrote: The user can modify the code to allow it to be inlined. There are a huge number of constructs that cause dmd's inliner to completely give up. If a function _must_ be inlined, the compiler needs to give an error if it fails. A separate message with a pragmatic difficulty with your suggestion. Different compilers will have different inlining capabilities. Different versions of the same compiler may behave differently. This means that sometimes a user may get a compilation failure, sometimes not. It's highly brittle. This is _exactly_ why error message is needed. I think the best route here - and the most in-the-spirit-of-D - is to provide introspection on whether a function is being inlined or not. Then we can always have in libraries:
bool uart(ubyte b) {
    static assert(__traits(inlined), "Inlining of uart() must be supported.");
    ...
}
Andrei
Re: H1 2015 Priorities and Bare-Metal Programming
On Tuesday, 3 February 2015 at 18:16:20 UTC, Andrei Alexandrescu wrote: I think the best route here - and the most in-the-spirit-of-D - is to provide introspection on whether a function is being inlined or not. Then we can always have in libraries:
bool uart(ubyte b) {
    static assert(__traits(inlined), "Inlining of uart() must be supported.");
    ...
}
That is a tempting path but I feel that it creates new issues for little benefit. Consider:
bool uart(ubyte b) {
    static if (__traits(inlined)) {
        // do something that can't be inlined
    } else {
        // do something that can be inlined
    }
}
And http://wiki.dlang.org/DIP56 is still needed for other purposes - so why create new brittle abstractions when it is possible to use existing dumb ones?
Re: H1 2015 Priorities and Bare-Metal Programming
On Tuesday, February 03, 2015 11:41:35 via Digitalmars-d wrote: On Tuesday, 3 February 2015 at 11:33:35 UTC, Mike wrote: I don't like hot/cold as it does not convey the effect. Yeah, I believe LLVM has a register saving calling convention that is cold_cc, so cold would be more suited for functions that are almost never called. Well, as far as I can tell, that's pretty much exactly what Walter meant by hot and cold - how likely they are to be called, with the idea that the compiler could then use that information to better optimize - be it inlining or some other optimization. - Jonathan M Davis
Re: H1 2015 Priorities and Bare-Metal Programming
On 2015-02-03 at 18:53, captaindet wrote: i am a simple user writing mostly programs to crunch my scientific data. i'd like to move my C code to D. but i deal with GBs of data that i have to sieve through many times. in C, i have lots of repeating integer stunts in the inner loop that must be inlined. i used macros for this. if i cannot mark small helper function in the inner loop so that they are guaranteed to be inlined, i am screwed. i would have to copy and paste lots of code, making the result worse than the C code. Or you'd have to use mixins as you would use C macros. But that's bending over backwards compared to inlining.
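To make the "mixins as C macros" workaround concrete, here is a minimal sketch using a string mixin. The name clampToByte and the clamping operation are hypothetical stand-ins for the "integer stunts" mentioned above. The expression text is pasted into the caller at compile time, so no function call exists at all, but as noted it is clumsier than a plain function the compiler is guaranteed to inline.

```d
/// Hypothetical macro-style helper: builds the source text of an
/// expression that clamps x to the range 0 .. 255.
enum clampToByte(string x) =
    "((" ~ x ~ ") < 0 ? 0 : ((" ~ x ~ ") > 255 ? 255 : (" ~ x ~ ")))";

void main()
{
    int v = 300;
    int w = -7;
    // mixin() compiles the generated text in place, like a C macro expansion.
    assert(mixin(clampToByte!"v") == 255);
    assert(mixin(clampToByte!"w") == 0);
    assert(mixin(clampToByte!"v - 200") == 100);
}
```

As with C macros, the argument text is evaluated more than once, so expressions with side effects are unsafe here.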
Re: H1 2015 Priorities and Bare-Metal Programming
Some perspective from a Rust developer: https://mail.mozilla.org/pipermail/rust-dev/2013-May/004272.html
Re: H1 2015 Priorities and Bare-Metal Programming
On Tuesday, 3 February 2015 at 09:36:57 UTC, Walter Bright wrote: On 2/3/2015 1:11 AM, Mike wrote: All things being equal, will there be any difference between the resulting binaries for each of these scenarios? No. Another way of putting it: Does pragma(inline, true) simply allow the user to compile parts of their source file with -inline? Yes. pragma(inline, false) paradoxically can be used to improve performance. Consider:
if (cond)
    foo();
else
    bar();
If cond is nearly always false, then foo() is rarely executed. If the compiler inlines it, it will likely take away registers from being used to inline bar(), and bar() needs those registers. By marking foo() as not inlinable, it won't consume those registers. (Also, inlining foo() may consume much code, making for a less efficient jump around it and making it less likely for the hot code to fit in the cache.) This is why I'm beginning to think a pragma(hot, true/false) might be a better approach, as there are more optimizations that can be done better if the compiler knows which branches are hot or not.
I think you're misunderstanding each other. As far as I understand it, Johannes doesn't care much about inline for optimizations. He wants to easily access a fixed memory location for MMIO. Now you're telling him to use volatileLoad and volatileStore to do this which may work but only has a bearable syntax if wrapped. But for his embedded work he needs to be sure that the wrapping is undone and thus needs either pragma(force_inline) or pragma(address). You're against force_inline, but now you're moving the goal posts by arguing against force_inline in the general case of code optimization. But that's not the problem here, we're talking MMIO with addresses embedded in the instruction stream. Besides this: Why should a compiler that has an inliner fail to inline a function marked with force_inline? The result may be undesirable but it should always work at least?
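A sketch of the cold-branch scenario Walter describes, written with the pragma(inline, false) form under discussion. The bodies of foo and bar, the process function and the slowPathCalls counter are hypothetical, added only so the example runs. The pragma merely asks the compiler not to inline foo; whether that actually frees registers for the hot path depends on the compiler and target.

```d
int slowPathCalls = 0;

// Rarely executed path: ask the compiler to keep it out of line.
pragma(inline, false)
void foo() { ++slowPathCalls; }

// Hot path: small enough that the compiler may choose to inline it.
int bar(int x) { return x * 2 + 1; }

int process(bool cond, int x)
{
    if (cond)
    {
        foo();
        return 0;
    }
    else
        return bar(x);
}

void main()
{
    int total = 0;
    foreach (i; 0 .. 1000)
        total += process(i == 999, i); // cond is true once in 1000 calls
    assert(slowPathCalls == 1);
    assert(total == 998_001);          // sum of 2*i + 1 for i = 0 .. 998
}
```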
Re: H1 2015 Priorities and Bare-Metal Programming
Walter Bright wrote in message news:maq0rp$2ar8$1...@digitalmars.com... I'd like to reexamine those assumptions, and do a little rewinding. The compiler offers a -inline switch, which will inline everything it can. Performance oriented code will use that switch. So why doesn't the compiler inline everything anyway? Because there's a downside - it can make code difficult to symbolically debug, and it makes for difficulties in getting good profile data. Manu was having a problem, though. He wanted inlining turned off globally so he could debug his code, but have it left on for a few functions where not inlining them would make the debug version too slow. pragma(inline,true) tells the compiler that this function is 'hot', and pragma(inline, false) that this function is 'cold'. Knowing the hot and cold paths enables the optimizer to do a better job. This doesn't make sense to me, because even if a function is 'hot' it still shouldn't be inlined if inlining is turned off. There are literally thousands of optimizations applied. Plucking exactly one out and elevating it to a do-or-die status, ignoring the other 999, is a false god. There's far more to a programmer reorganizing his code to make it run faster than just sprinkling it with forceinline pixie dust. Nobody is suggesting that. forceinline is for when either a) the function is a trivial wrapper and should always be expanded inline (i.e. where macros are typically used in C) or b) the compiler's heuristics have failed and profiling/inspecting the generated code has shown that the function should be inlined. There is a lot of value to telling the compiler where the hot and cold parts are, because those cannot be statically determined. But exactly how to achieve that goal really should be left up to the compiler implementer. Doing a better or worse job of that is a quality of implementation issue, not a language specification issue. Yes and no. 
It is still useful to have a way to tell the compiler exactly what to do, when needed. Eg we can allocate arrays on the stack, even though the compiler could theoretically move heap allocations there without user intervention. Perhaps the fault here is calling it pragma(inline,true). Perhaps if it was pragma(hot) and pragma(cold) instead? That would indeed be a better name, but it still wouldn't be what people are asking for.
Re: H1 2015 Priorities and Bare-Metal Programming
On 03.02.15 10:35, Walter Bright wrote: On 2/3/2015 1:11 AM, Mike wrote: Another way of putting it: Does pragma(inline, true) simply allow the user to compile parts of their source file with -inline? Yes. Eh, yes :) I see now, errors/warnings are invasive compared to this simple, useful addition. And undesirable generally.
Re: H1 2015 Priorities and Bare-Metal Programming
Walter Bright wrote in message news:maq8ao$2idu$1...@digitalmars.com... Yup. I understand the concern that a compiler would opt out of inlining those if it legally could, but I just cannot see that happening in reality. Modern compilers have been inlining for 25 years now, and they're not likely to just stop doing it. No, the problem is that the code might accidentally contain a construct that is not inlineable. The user will expect it to be inlined, but the compiler will silently fail. e.g. void myWrapperFunc() { callSomeFunc(999, 123, something); } This function will not be inlined if callSomeFunc has a default argument that calls alloca, for example. If a hidden failure becomes a compiler error, the user can trivially correct the problem.
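[Editor's sketch] The alloca case described above can be written out as a small runnable D program. This is only an illustration under assumptions: `callSomeFunc`, its `scratch` parameter, and the return values are hypothetical, and the sketch makes no claim about what any particular inliner does with it. It only shows the construct: a default argument that calls alloca is evaluated at the call site, so the stack allocation lands in the wrapper.

```d
import core.stdc.stdlib : alloca;

// Hypothetical callee. The last parameter has a default argument that
// calls alloca; D evaluates default arguments at the call site, so the
// allocation happens in the caller's stack frame.
int callSomeFunc(int a, int b, int c, void* scratch = alloca(16))
{
    // Use the scratch pointer so the default argument is not dead code.
    return scratch !is null ? a + b + c : -1;
}

// Looks like a trivial wrapper, but it silently contains an alloca call.
int myWrapperFunc()
{
    return callSomeFunc(999, 123, 1);
}

void main()
{
    assert(myWrapperFunc() == 1123);
}
```

Nothing in the source of `myWrapperFunc` hints at the alloca, which is the "hidden" part of the hidden failure discussed in the message.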
Re: H1 2015 Priorities and Bare-Metal Programming
Walter Bright wrote in message news:maq7ra$2huu$1...@digitalmars.com... To not inline trivial functions when presented with forceinline would indeed be perverse, and while legally possible as I've said before no compiler writer would do that. Even dmd (!) has no trouble at all inlining trivial functions. See my alloca example. I mistrust the inliner because I know it has problems. But the trouble is, people will use forceinline on very non-trivial functions, and functions where it would actually make things worse, etc., and then to have the compiler error out on them would not be productive. See the Rust link I provided on experience with the use and misuse of forceinline. Why do we have inline assembly? Why do we allow recursion? We can't stop programmers from doing stupid things, and we shouldn't be trying to.
Re: H1 2015 Priorities and Bare-Metal Programming
Walter Bright wrote in message news:maq7f1$2hka$1...@digitalmars.com... On 2/3/2015 1:49 AM, Daniel Murphy wrote: This doesn't make sense to me, because even if a function is 'hot' it still shouldn't be inlined if inlining is turned off. 'hot' can be interpreted to be inline even if inlining is turned off. (That is what Manu wanted.) It's just a naming thing, it's not important. It's still elevating inlining above all other optimizations (and inlining is nothing more than just another optimization). For example, register allocation is critical to code performance, and the optimizer frequently doesn't do the best job of it. So what? It's a pragma used in low-level code. Some C/C++ compilers provide similar hints for loop unrolling, vectorization, etc. It's certainly not worth a keyword or any major language changes, but a pragma doesn't cost anything to add. D has inline assembly for a similar reason - sometimes the programmer knows best. Back in the olden days, with dmc you could individually turn various optimizations on and off. I finally gave up on that because it was useful to nobody. The 'register' keyword was dropped because although it could be used to do better register allocation, in reality it was so misused it would just make things worse. And yet you kept -inline as a separate flag in dmd. Like I said, there are thousands of optimizations in the compiler. They all interact with each other in usually unexpected ways. Focusing on just one in isolation is not likely to yield best results. But with hot-or-not, instead you are giving the compiler useful information to guide its heuristics. Hot-or-not is certainly useful, and probably much more widely useful than forceinline. But that doesn't mean forceinline isn't useful. It's like the old joke where a captain is asked by a colonel how he'd get a flagpole raised. The captain replied with a detailed set of instructions. 
The colonel said wrong answer, the correct response would be for the captain to say: Sergeant, get that flag pole raised! Hot-or-not gives information to guide the heuristics of the compiler's decisions. For a related example, the compiler assumes that loops are executed 10 times when weighting variables for who gets enregistered. Giving hot-or-not guidance may raise it to 20 for hot, and lower it to 1 for not. There are many places in the optimizer where a cost function is used, not just inlining decisions. Yes, this information is useful. So is forceinline. I understand. And I suggest instead they ask me to get that flagpole raised, sergeant! We have inline assembler because sometimes being explicit is what's needed. I would consider using forceinline in the same situations where inline assembly is a viable option. eg interfacing with hardware, computation kernels
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/1/2015 9:48 PM, Walter Bright wrote: On 2/1/2015 9:21 PM, Daniel Murphy wrote: struct Ports { static ubyte B() { return volatileLoad(cast(ubyte *)0x0025); } static void B(ubyte value) { volatileStore(cast(ubyte *)0x0025, value); } } A somewhat more refined version: import core.bitop; template Ports(T, uint address) { @property T B() { return volatileLoad(cast(T *)address); } @property void B(T value) { volatileStore(cast(T *)address, value); } } alias Ports!(uint, 0x1234) MyPort; uint test(uint x) { MyPort.B(x); MyPort.B(x); return MyPort.B(); } Compiling with: dmd -c foo -O -release -inline gives: _D3foo4testFkZk: push EAX mov ECX,01234h mov [ECX],EAX mov [ECX],EAX // the redundant store was not optimized away! mov EAX,[ECX] // nor was the common subexpression removed add ESP,4 ret See the volatile semantics noted in the comments.
Re: H1 2015 Priorities and Bare-Metal Programming
Tobias Pankrath wrote in message news:cumpcsdbtreytdxxc...@forum.dlang.org... Besides this: Why should a compiler that has an inliner fail to inline a function marked with force_inline? The result may be undesirable but it should always work at least? The inliner in dmd fails to inline many constructs, loops for example. It would succeed on all of the cases relevant to wrapping mmio.
Re: H1 2015 Priorities and Bare-Metal Programming
On Tuesday, 3 February 2015 at 10:10:43 UTC, Daniel Murphy wrote: Tobias Pankrath wrote in message news:cumpcsdbtreytdxxc...@forum.dlang.org... Besides this: Why should a compiler that has an inliner fail to inline a function marked with force_inline? The result may be undesirable but it should always work at least? The inliner in dmd fails to inline many constructs, loops for example. It would succeed on all of the cases relevant to wrapping mmio. Why couldn't he just copy paste the functions code?
Re: H1 2015 Priorities and Bare-Metal Programming
Tobias Pankrath wrote in message news:zdsqgbuoobnhnjrtp...@forum.dlang.org... Why couldn't he just copy paste the functions code? Why would he want to do that?
Re: H1 2015 Priorities and Bare-Metal Programming
On Tuesday, 3 February 2015 at 10:15:38 UTC, Daniel Murphy wrote: Tobias Pankrath wrote in message news:zdsqgbuoobnhnjrtp...@forum.dlang.org... Why couldn't he just copy paste the functions code? Why would he want to do that? Let me rephrase the question: Why should inlining a function be impossible, if it can be done by a simple AST transformation?
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/3/2015 1:49 AM, Daniel Murphy wrote: This doesn't make sense to me, because even if a function is 'hot' it still shouldn't be inlined if inlining is turned off. 'hot' can be interpreted to be inline even if inlining is turned off. (That is what Manu wanted.) There are literally thousands of optimizations applied. Plucking exactly one out and elevating it to a do-or-die status, ignoring the other 999, is a false god. There's far more to a programmer reorganizing his code to make it run faster than just sprinkling it with forceinline pixie dust. Nobody is suggesting that. forceinline is for when either a) the function is a trivial wrapper and should always be expanded inline (i.e. where macros are typically used in C) or b) the compiler's heuristics have failed and profiling/inspecting the generated code has shown that the function should be inlined. It's still elevating inlining above all other optimizations (and inlining is nothing more than just another optimization). For example, register allocation is critical to code performance, and the optimizer frequently doesn't do the best job of it. There is a lot of value to telling the compiler where the hot and cold parts are, because those cannot be statically determined. But exactly how to achieve that goal really should be left up to the compiler implementer. Doing a better or worse job of that is a quality of implementation issue, not a language specification issue. Yes and no. It is still useful to have a way to tell the compiler exactly what to do, when needed. Eg we can allocate arrays on the stack, even though the compiler could theoretically move heap allocations there without user intervention. Back in the olden days, with dmc you could individually turn various optimizations on and off. I finally gave up on that because it was useful to nobody. 
The 'register' keyword was dropped because although it could be used to do better register allocation, in reality it was so misused it would just make things worse. Like I said, there are thousands of optimizations in the compiler. They all interact with each other in usually unexpected ways. Focusing on just one in isolation is not likely to yield best results. But with hot-or-not, instead you are giving the compiler useful information to guide its heuristics. It's like the old joke where a captain is asked by a colonel how he'd get a flagpole raised. The captain replied with a detailed set of instructions. The colonel said wrong answer, the correct response would be for the captain to say: Sergeant, get that flag pole raised! Hot-or-not gives information to guide the heuristics of the compiler's decisions. For a related example, the compiler assumes that loops are executed 10 times when weighting variables for who gets enregistered. Giving hot-or-not guidance may raise it to 20 for hot, and lower it to 1 for not. There are many places in the optimizer where a cost function is used, not just inlining decisions. Perhaps the fault here is calling it pragma(inline,true). Perhaps if it was pragma(hot) and pragma(cold) instead? That would indeed be a better name, but it still wouldn't be what people are asking for. I understand. And I suggest instead they ask me to get that flagpole raised, sergeant!
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/3/2015 1:56 AM, Daniel Murphy wrote: I don't expect this to be a huge problem, because most functions marked with forceinline would be trivial. eg. setREG(ubyte val) { volatileStore(cast(ubyte*)0x1234, val); } This function only exists to give a nicer interface to the register. If the compiler can't inline it, I want to know about it at compilation time rather than later. Again, it's for those cases that would just be done with macros in C. Where the code should always be inlined but doing it manually in the source would lead to maintenance problems. To not inline trivial functions when presented with forceinline would indeed be perverse, and while legally possible as I've said before no compiler writer would do that. Even dmd (!) has no trouble at all inlining trivial functions. But the trouble is, people will use forceinline on very non-trivial functions, and functions where it would actually make things worse, etc., and then to have the compiler error out on them would not be productive. See the Rust link I provided on experience with the use and misuse of forceinline.
Re: H1 2015 Priorities and Bare-Metal Programming
Tobias Pankrath wrote in message news:vzgszrvcxxpethbdl...@forum.dlang.org... Let me rephrase the question: Why should inlining a function be impossible, if it can be done by a simple AST transformation? It's not impossible, dmd's inliner just can't currently do it. The transformation isn't all that simple either.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/3/2015 2:10 AM, Daniel Murphy wrote: The inliner in dmd fails to inline many constructs, loops for example. It will inline a loop if the function is called at the statement level. The trouble with inlining a loop inside an expression is that it is not expressible in the expression tree used in the back end. Obviously, inlining functions with loops tends to have lower payoffs anyway, because the loop time swamps the function call overhead. Inlining a loop can even make things worse, because the loop variables may not get priority for enregistering whereas they would if in a separate function. I.e. it is not a trivial issue of inlining is faster. It would succeed on all of the cases relevant to wrapping mmio. Yup. I understand the concern that a compiler would opt out of inlining those if it legally could, but I just cannot see that happening in reality. Modern compilers have been inlining for 25 years now, and they're not likely to just stop doing it. It's as unlikely as the compiler failing to rewrite: x *= 32; into: x <<= 5;
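[Editor's sketch] The strength-reduction rewrite used as the comparison above (a multiply by 32 turned into a left shift by 5) can be checked directly. The function names below are made up for illustration; the sketch only confirms that the two forms agree for unsigned 32-bit values, including when the result wraps around.

```d
// Multiply-by-32 written the "source" way.
uint viaMultiply(uint x)
{
    x *= 32;
    return x;
}

// The same operation written as the shift an optimizer would emit.
uint viaShift(uint x)
{
    x <<= 5;
    return x;
}

void main()
{
    // Small values, and values large enough that the result wraps.
    foreach (uint x; [0u, 1u, 7u, 0x1234u, 0x0800_0000u, uint.max])
        assert(viaMultiply(x) == viaShift(x));
}
```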
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/3/2015 1:11 AM, Mike wrote: All things being equal, will there be any difference between the resulting binaries for each of these scenarios? No. Another way of putting it: Does pragma(inline, true) simply allow the user to compile parts of their source file with -inline? Yes. pragma(inline, false) paradoxically can be used to improve performance. Consider: if (cond) foo(); else bar(); If cond is nearly always false, then foo() is rarely executed. If the compiler inlines it, it will likely take away registers from being used to inline bar(), and bar() needs those registers. By marking foo() as not inlinable, it won't consume those registers. (Also, inlining foo() may consume much code, making for a less efficient jump around it and making it less likely for the hot code to fit in the cache.) This is why I'm beginning to think a pragma(hot, true/false) might be a better approach, as there are more optimizations that can be done better if the compiler knows which branches are hot or not.
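[Editor's sketch] The if (cond) foo(); else bar(); example above might look like this with the pragma as it is spelled in this thread. The function bodies and the name `dispatch` are invented for illustration, and whether a given compiler actually honors either request is exactly what the thread is arguing about; the sketch only shows where the annotations would go.

```d
// Rarely executed path: ask the compiler to keep it out of line so it
// does not compete with the hot path for registers or cache space.
pragma(inline, false)
int foo(int x)
{
    return x < 0 ? 0 : x;
}

// Frequently executed path: ask for it to be inlined.
pragma(inline, true)
int bar(int x)
{
    return x * 2 + 1;
}

// cond is assumed to be false nearly all of the time.
int dispatch(bool cond, int x)
{
    if (cond)
        return foo(x);
    else
        return bar(x);
}

void main()
{
    assert(dispatch(false, 3) == 7);
    assert(dispatch(true, -3) == 0);
}
```

The annotations do not change what the program computes, only how the call sites may be compiled.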
Re: H1 2015 Priorities and Bare-Metal Programming
On Tuesday, 3 February 2015 at 08:28:42 UTC, Walter Bright wrote: The compiler offers a -inline switch, which will inline everything it can. Performance oriented code will use that switch. pragma(inline,true) tells the compiler that this function is 'hot', and pragma(inline, false) that this function is 'cold'. Knowing the hot and cold paths enables the optimizer to do a better job. Assume I'm creating a bare-metal program with 2 functions: an entry point `void _start` and a function that puts a byte in an MMIO UART's send buffer `void send(byte b)`. `_start` calls `send`. There is no phobos, druntime, or any other libraries. It is just my test.d source file only. (Please don't nitpick this with irrelevant technicalities) scenario A) compile test.d with -inline; `_start` is pragma(inline, false); `send` is pragma(inline, true) -- this is redundant, yes? scenario B) compile with -inline; `_start` is pragma(inline, false); `send` is pragma(inline). scenario C) compile without -inline; `_start` is pragma(inline, false); `send` is pragma(inline, true). All things being equal, will there be any difference between the resulting binaries for each of these scenarios? Another way of putting it: Does pragma(inline, true) simply allow the user to compile parts of their source file with -inline? Mike
Re: H1 2015 Priorities and Bare-Metal Programming
Walter Bright wrote in message news:maq10s$2avu$1...@digitalmars.com... A separate message with a pragmatic difficulty with your suggestion. Different compilers will have different inlining capabilities. Different versions of the same compiler may behave differently. This means that sometimes a user may get a compilation failure, sometimes not. It's highly brittle. So enter the workaround code. Different compilers and different versions will require different workaround code. Is this really reasonable for users to put up with? And will they really want to be running the workaround code when they upgrade the compiler and now it could have inlined it? I don't expect this to be a huge problem, because most functions marked with forceinline would be trivial. eg. setREG(ubyte val) { volatileStore(cast(ubyte*)0x1234, val); } This function only exists to give a nicer interface to the register. If the compiler can't inline it, I want to know about it at compilation time rather than later. Again, it's for those cases that would just be done with macros in C. Where the code should always be inlined but doing it manually in the source would lead to maintenance problems.
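[Editor's sketch] A trivial register wrapper of the kind described above, written so it can run on an ordinary hosted system. Assumptions, all of them the editor's: a recent druntime where volatileLoad and volatileStore live in core.volatile (the 2015 messages in this thread import core.bitop instead), a stand-in variable `fakeReg` in place of the fixed address 0x1234, and a hypothetical `getREG` reader added for symmetry. What the compiler does when it cannot inline such a function is implementation-specific and is the open question of this thread.

```d
import core.volatile : volatileLoad, volatileStore;

// Stand-in for a hardware register so the sketch runs on a desktop.
// On a real target the pointer would be a fixed address instead,
// e.g. cast(ubyte*) 0x1234 as in the message above.
__gshared ubyte fakeReg;

// Thin wrapper that only exists to give the register a nicer interface.
pragma(inline, true)
void setREG(ubyte val)
{
    volatileStore(&fakeReg, val);
}

// Hypothetical matching reader.
pragma(inline, true)
ubyte getREG()
{
    return volatileLoad(&fakeReg);
}

void main()
{
    setREG(0x5A);
    assert(getREG() == 0x5A);
}
```

In C the same interface would typically be a macro; the wrapper keeps type checking and a single definition of the address.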
Re: H1 2015 Priorities and Bare-Metal Programming
Walter Bright wrote in message news:maq48d$2enr$1...@digitalmars.com... Some perspective from a Rust developer: https://mail.mozilla.org/pipermail/rust-dev/2013-May/004272.html I think that's mostly an argument against misuse of forceinline.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/2015 1:24 PM, Johannes Pfau wrote: What's your argument? That it still generates 2 instructions in the simplest case? That's an X86 specific detail. On ARM and other RISC architectures there is a difference between loading a literal (code into the instruction) or loading a runtime value. On AVR gcc can even rewrite bit-sized stores into set-bit and loads into read-bit instructions, but it needs to know the addresses at compile time. If you don't believe me get an AVR/ARM compiler and try it. A code generator for a specific architecture will naturally generate code that caters to it. volatileLoad()/Store() does not impede that, and a pragma(address) will not help.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/2015 10:44 AM, Iain Buclaw via Digitalmars-d wrote: On 2 February 2015 at 17:43, Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 2/2/15 9:23 AM, Iain Buclaw via Digitalmars-d wrote: That code doesn't work with DMD. http://goo.gl/hgsHg0 Has that been filed yet? -- Andrei https://issues.dlang.org/show_bug.cgi?id=14114 The optimizer is regarding as a null any value being dereferenced that is less than 4096. Whether this is a bug or not is debatable.
Re: H1 2015 Priorities and Bare-Metal Programming
On Mon, 02 Feb 2015 13:15:13 -0800, Walter Bright newshou...@digitalmars.com wrote: On 2/2/2015 9:15 AM, Johannes Pfau wrote: It's also necessary that the compiler knows after inlining that the address is a literal. Loading data from fixed literal addresses produces different, more efficient code than loading from a runtime address. As the function code will generally be written for runtime values the compiler must optimize after inlining to recognize the inlined code deals with literals. import core.bitop; uint test() { return volatileLoad(cast(uint *)0x1234); } --- dmd -c foo --- _D3foo4testFZkL: mov EAX,01234h mov EAX,[EAX] ret Note that was an unoptimized build. What's your argument? That it still generates 2 instructions in the simplest case? That's an X86 specific detail. On ARM and other RISC architectures there is a difference between loading a literal (code into the instruction) or loading a runtime value. On AVR gcc can even rewrite bit-sized stores into set-bit and loads into read-bit instructions, but it needs to know the addresses at compile time. If you don't believe me get an AVR/ARM compiler and try it.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/2015 6:58 AM, Manu via Digitalmars-d wrote: They need to be wrapped to be useful, Wrapping them is a subjective matter of taste. And before anyone says I don't know what I'm talking about, I used to write embedded systems software. :-)
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/2015 9:06 AM, Johannes Pfau wrote: _Dmain: push rbp mov rbp,rsp sub rsp,0x10 mov rax,0x5 == mov QWORD PTR [rbp-0x8],rax mov ecx,DWORD PTR [rax] == a register based load The instruction it should generate is mov ecx, [0x5] In 64 bit mode, there is no direct addressing like that. The above would be relative to the instruction pointer, which is RIP, and is actually: mov ECX, 5[RIP] So, to load address location 5, you would have to load it into a register first. (You'd be right for 32 bit x86. But also, all 32 bit x86's have an MMU rather than direct addressing, and it would be strange to set up the x86 embedded system to use MMIO rather than the IO instructions, which are designed for that purpose.) Not sure if it's actually more efficient on X86 but it makes a huge difference on real microcontroller architectures. What addressing mode is generated by the back end has nothing whatsoever to do with using volatileLoad() or pragma(address). To reiterate, volatileLoad() and volatileStore() are not reordered by the optimizer, and replacing them with pragma(address) is not going to make for better code generation. The only real issue is the forceinline one.
Re: H1 2015 Priorities and Bare-Metal Programming
On Mon, 02 Feb 2015 12:39:28 -0500, Steven Schveighoffer schvei...@yahoo.com wrote: On 2/2/15 12:06 PM, Johannes Pfau wrote: On Mon, 02 Feb 2015 02:49:48 -0800, Walter Bright newshou...@digitalmars.com wrote: Please try it before deciding it does not work. I guess one ad hominem wasn't enough? Sorry, I'm not really vested in this discussion at all, but I don't think you realize what ad hominem means. http://en.wikipedia.org/wiki/Ad_hominem -Steve Ad hominem literally means 'to the person'. en/wikipedia reduces that to character but other definitions (de/wikipedia) include all arguments against a person instead of to the content of the arguments. Walter implicitly doubted my qualification in his last reply by claiming I don't understand how intrinsics work. Here he basically said I didn't even try to run the code and am just making up issues. He's essentially saying I'm dishonest. He didn't respond to the content of my arguments. This is clearly not an argument, it's an attack on my reputation. So how is this not ad hominem?
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/2015 6:43 AM, Manu via Digitalmars-d wrote: I'm pretty sure the only controversy is that you want it to be a pragma, everyone else wants it to be an attribute. That's correct. My reasoning is simple - an attribute defines the semantics of the interface, a pragma gives instructions to the compiler, and does not affect logical semantics. For example, attributes change the name mangling, because it affects the semantic interface. A pragma would not.
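[Editor's sketch] The attribute-versus-pragma distinction drawn above can be observed through the mangled form of a function's type. The three function names are made up for illustration. The sketch relies only on two things: `nothrow` is part of a function's type and therefore of its mangling, while `pragma(inline, true)` is not part of the type.

```d
void plain() {}

// An attribute: it becomes part of the function's type.
void quiet() nothrow {}

// A pragma: an instruction to the compiler, not part of the type.
pragma(inline, true)
void hinted() {}

// The attribute changes the mangled type...
static assert(typeof(plain).mangleof != typeof(quiet).mangleof);

// ...while the pragma leaves it untouched.
static assert(typeof(plain).mangleof == typeof(hinted).mangleof);

void main()
{
    plain();
    quiet();
    hinted();
}
```

Because the checks are `static assert`s, a successful compile is already the result.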
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/2015 9:15 AM, Johannes Pfau wrote: It's also necessary that the compiler knows after inlining that the address is a literal. Loading data from fixed literal addresses produces different, more efficient code than loading from a runtime address. As the function code will generally be written for runtime values the compiler must optimize after inlining to recognize the inlined code deals with literals. import core.bitop; uint test() { return volatileLoad(cast(uint *)0x1234); } --- dmd -c foo --- _D3foo4testFZkL: mov EAX,01234h mov EAX,[EAX] ret Note that was an unoptimized build.
Re: H1 2015 Priorities and Bare-Metal Programming
On Monday, 2 February 2015 at 21:01:30 UTC, Johannes Pfau wrote: On Mon, 02 Feb 2015 12:39:28 -0500, Steven Schveighoffer schvei...@yahoo.com wrote: On 2/2/15 12:06 PM, Johannes Pfau wrote: On Mon, 02 Feb 2015 02:49:48 -0800, Walter Bright newshou...@digitalmars.com wrote: Please try it before deciding it does not work. I guess one ad hominem wasn't enough? Sorry, I'm not really vested in this discussion at all, but I don't think you realize what ad hominem means. http://en.wikipedia.org/wiki/Ad_hominem -Steve Ad hominem literally means 'to the person'. en/wikipedia reduces that to character but other definitions (de/wikipedia) include all arguments against a person instead of to the content of the arguments. Walter implicitly doubted my qualification in his last reply by claiming I don't understand how intrinsics work. Here he basically said I didn't even try to run the code and am just making up issues. He's essentially saying I'm dishonest. He didn't respond to the content of my arguments. This is clearly not an argument, it's an attack on my reputation. So how is this not ad hominem? I agree it was ad hominem, but I don't think Walter implied you were dishonest, so much as *ignorant* (i.e. of what would *really* happen if you just used the products as intended) - which implication is still bad, if proven false, but not quite as bad as calling you dishonest...
Re: H1 2015 Priorities and Bare-Metal Programming
On Monday, 2 February 2015 at 21:19:05 UTC, Walter Bright wrote: On 2/2/2015 9:17 AM, H. S. Teoh via Digitalmars-d wrote: Walter seems to dislike forced inlining for various reasons, preferring inlining as a hint at the most, and he probably has a point in most cases (let the compiler make the judgment). But in other cases, such as the one in question, the user needs to override the compiler's decision. Currently there's no way to do that, and it's a showstopper for those users. This is a settled issue. After all, I wrote: http://wiki.dlang.org/DIP56 Erm. Quoting the DIP: If a pragma specifies always inline, whether or not the target function(s) are actually inlined is implementation defined, although the implementation will be expected to inline it if practical. This is exactly the absolutely unacceptable part that makes your DIP useless and last discussion has stalled (from my POV) exactly at the point where you refused to negotiate any compromises on that matter.
Re: H1 2015 Priorities and Bare-Metal Programming
On Mon, Feb 02, 2015 at 09:53:43PM +, Dicebot via Digitalmars-d wrote: On Monday, 2 February 2015 at 21:19:05 UTC, Walter Bright wrote: On 2/2/2015 9:17 AM, H. S. Teoh via Digitalmars-d wrote: Walter seems to dislike forced inlining for various reasons, preferring inlining as a hint at the most, and he probably has a point in most cases (let the compiler make the judgment). But in other cases, such as the one in question, the user needs to override the compiler's decision. Currently there's no way to do that, and it's a showstopper for those users. This is a settled issue. After all, I wrote: http://wiki.dlang.org/DIP56 Erm. Quoting the DIP: If a pragma specifies always inline, whether or not the target function(s) are actually inlined is implementation defined, although the implementation will be expected to inline it if practical. This is exactly the absolutely unacceptable part that makes your DIP useless and last discussion has stalled (from my POV) exactly at the point where you refused to negotiate any compromises on that matter. Yes, and this is the sore point with the force-inline proponents. You're dangling the carrot of inline control in front of them, but snatch it away with the implementation-defined part. That might as well not be any control at all, since the default inlining behaviour is already implementation-defined; having two states of implementation-defined behaviour doesn't give us anything better than what we already have. The whole point behind inline control is to give the programmer a way to *override* the compiler's decision when the compiler's decision is wrong. By saying it's implementation-defined, you put the decision back in the compiler's hands, and so the compiler may continue making the same wrong decision, and we have achieved nothing at all. 
Of course, this is the pessimistic interpretation of DIP56, and perhaps, as a bystander, I can hazard a guess as to why forced inlining is left to the compiler's discretion -- one might want a compiler flag to disable forced inlining for debugging purposes, for example. However, without being more specific about exactly under what circumstances forced inlining is not obeyed, DIP56 leaves the decision completely in the implementor's hands, and users have no choice but to assume the worst. T -- Notwithstanding the eloquent discontent that you have just respectfully expressed at length against my verbal capabilities, I am afraid that I must unfortunately bring it to your attention that I am, in fact, NOT verbose.
Re: H1 2015 Priorities and Bare-Metal Programming
On 02/02/2015 12:56 PM, Timo Sintonen wrote: Developers make things _for_ users. Hear, hear. That's a point I always feel deserves far more attention and deliberate, conscious appreciation than it typically gets. (Just a general observation of the overall software development world, not necessarily specific to this discussion.) If this was a commercial product, lack of listening to users' needs would be fatal to the company. Man do I wish that really was true. It's likely true for a lot of small businesses, but the corporate big dogs can and regularly do get by fine without listening :( That's the power of oligopoly and mindshare.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/2015 9:17 AM, H. S. Teoh via Digitalmars-d wrote: On Mon, Feb 02, 2015 at 08:55:59AM -0800, Andrei Alexandrescu via Digitalmars-d wrote: On 2/2/15 8:42 AM, Johannes Pfau wrote: Again the problem is not volatileLoad/Store which translate to single instructions it's wrappers. So does the argument boil down to better inlining control and enforcement? -- Andrei FWIW, from the POV of a bystander, the point is that force-inline for certain things (namely wrappers around certain intrinsics, like operator overloads for atomics, what-have-you) is necessary in order to guarantee that function call overhead will not be incurred no matter what. Walter seems to dislike forced inlining for various reasons, preferring inlining as a hint at the most, and he probably has a point in most cases (let the compiler make the judgment). But in other cases, such as the one in question, the user needs to override the compiler's decision. Currently there's no way to do that, and it's a showstopper for those users. This is a settled issue. After all, I wrote: http://wiki.dlang.org/DIP56 Unless we're proposing to flood the compiler with intrinsics, one for every possible operator overload of volatile loads/stores, which I think should be obviously infeasible. And even then, you still might miss one or two other obscure wrappers that users might discover that they need. It seems reasonable that instead of burdening the compiler (and compiler maintainers, and porters) to keep up with an ever-expanding set of intrinsics, making use of the language to express what is needed via forced-inline functions is a better way to do things. I agree that adding more language features instead of inline control is wrong.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/2015 2:34 PM, Jonathan M Davis via Digitalmars-d wrote: That makes sense, though the one issue that I see with making it a pragma is the fact that pragmas are supposed to be compiler-specific and not part of the language (at least as I understand it), and I would expect that anyone looking to force inlining would want it guaranteed regardless of the compiler. Of course, all compilers could implement it, and with a shared frontend for most of the D compilers, it would likely be in most of them anyway, but if it's a pragma, it does seem like it wouldn't necessarily be guaranteed. There are already several pragmas implemented. I'd be surprised if some compiler chose not to do them. As I wrote to Dicebot, compiler implementers aim to please their users as much as possible. They do not attempt to perversely implement the spec in such a way as to conform to the letter but be useless in practice. For example, I could easily write a Standard conformant C compiler that would be useless. But nobody does such a thing unless they are incompetent, and even if they did, why would anyone waste their time using such a product?
Re: H1 2015 Priorities and Bare-Metal Programming
On Monday, 2 February 2015 at 23:29:26 UTC, Walter Bright wrote: Now, when it can't inline, do you expect the compiler to produce an error message? Yes, this is the very point of such a feature: stating that if the code can't be inlined, it is effectively unusable. If so, what corrective action is the user faced with: pragma(inline, some expression that determines the compiler version and evaluates to true only if this particular function can be inlined); Prohibit this generally useless misfeature and keep the base semantics useful.
Re: H1 2015 Priorities and Bare-Metal Programming
On 03.02.15 00:29, Walter Bright wrote: Now, when it can't inline, do you expect the compiler to produce an error message? Or warning? Microcontroller programmers like to look at produced code, no need to force them :)
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/2015 2:30 PM, Johannes Pfau wrote: It does: if the backend can't know that a value is known at compile time it can't use absolute addresses: void test(ubyte* ptr) { volatileLoad(ptr); // Can't use literal addressing, might be a runtime value } The context here is that pragma(address) allows avoiding one wrapper function. See below. Again, that is simply an inlining issue. The pragma(address, 0x05) makes sure that the compiler backend always knows that PORTA is at 0x05. Constant propagation and inlining do that. Both are standard optimizations that every compiler does. Adding language features on the presumption that compilers won't do that is like trying to fix the broken engine in your car by adding another engine in the trunk. -O mostly fixes performance problems, but adding an additional property function is still much uglier than declaring an extern variable with an address in many ways. (compiler bugs, Language features should not be added because of compiler bugs. user-facing code, Library wrapper types will be showing up more and more. How nice they are is up to the library designer. debug info, ...) Symbolic debugging is always going to be an issue until there are debuggers that are better designed to work with D. Also it's a conceptually nice way for typed registers: You can read it as: I've got a Register of type PORT which is an extern variable located at a fixed address. PORT abstracts away volatile access. auto a = PORT!0x1234; looks nicer than: pragma(address, 0x1234) PORT a;
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/2015 1:53 PM, Dicebot wrote: http://wiki.dlang.org/DIP56 Erm. Quoting the DIP: If a pragma specifies always inline, whether or not the target function(s) are actually inlined is implementation defined, although the implementation will be expected to inline it if practical. This is exactly the absolutely unacceptable part that makes your DIP useless and last discussion has stalled (from my POV) exactly at the point where you refused to negotiate any compromises on that matter. That interpretation is a little over the top. Any reasonable implementation is going to do what it can to inline when asked to - people who write compilers do not try to perversely interpret the spec in order to be as useless as possible. (After all, we are not writing the tax code!) Now, when it can't inline, do you expect the compiler to produce an error message? If so, what corrective action is the user faced with: pragma(inline, some expression that determines the compiler version and evaluates to true only if this particular function can be inlined); ? Such will necessarily be brittle, and of dubious utility.
Re: H1 2015 Priorities and Bare-Metal Programming
On Mon, 02 Feb 2015 13:44:41 -0800, Walter Bright newshou...@digitalmars.com wrote: On 2/2/2015 9:06 AM, Johannes Pfau wrote:

    _Dmain:
        push rbp
        mov  rbp,rsp
        sub  rsp,0x10
        mov  rax,0x5                  <==
        mov  QWORD PTR [rbp-0x8],rax
        mov  ecx,DWORD PTR [rax]      <== a register based load

The instruction it should generate is mov ecx, [0x5]

In 64 bit mode, there is no direct addressing like that. The above would be relative to the instruction pointer, which is RIP, and is actually: mov ECX, 5[RIP] So, to load address location 5, you would have to load it into a register first. (You'd be right for 32 bit x86. But also, all 32 bit x86's have an MMU rather than direct addressing, and it would be strange to set up the x86 embedded system to use MMIO rather than the IO instructions, which are designed for that purpose.)

Well, as I said it's different on RISC. I'm mainly programming for ARM, AVR, MSP430 and similar systems, not X86. Not sure if it's actually more efficient on X86 but it makes a huge difference on real microcontroller architectures.

What addressing mode is generated by the back end has nothing whatsoever to do with using volatileLoad() or pragma(address).

It does: if the backend can't know that a value is known at compile time it can't use absolute addresses: void test(ubyte* ptr) { volatileLoad(ptr); // Can't use literal addressing, might be a runtime value } The context here is that pragma(address) allows avoiding one wrapper function. See below. ARM can code address literals into instructions. So you end up with one instruction for a load from a compile time known address.

To reiterate, volatileLoad() and volatileStore() are not reordered by the optimizer, and replacing them with pragma(address) is not going to make for better code generation. The only real issue is the forceinline one.

I think we're talking different languages. Nobody ever proposed pragma(address) to replace volatileLoad.
It's meant to be used together with the volatile intrinsics like this:

----
import core.bitop;

struct Volatile(T)
{
private:
    T _store;

public:
    @disable this(this);

    /**
     * Performs 1 load followed by 1 store
     */
    @attribute("inlineonly") void opOpAssign(string op)(in T rhs) nothrow @trusted
    {
        T val = volatileLoad(&_store);
        mixin("val" ~ op ~ "= rhs;");
        volatileStore(&_store, val);
    }

    //In reality, much more complicated wrappers are possible
    //http://pastebin.com/RGhKdm9i
}

pragma(address, 0x05) extern __gshared Volatile!ubyte PORTA;

//...
PORTA |= 0b_0001;
auto addr = &PORTA;
----

The pragma(address, 0x05) makes sure that the compiler backend always knows that PORTA is at 0x05. Things like &PORTA become trivial and the compiler backend has exactly the same knowledge as if you'd use C volatile => all optimizations apply. And if you call opOpAssign the backend knows that the this pointer is a compile time literal value and generates exactly the same code as if you wrote

    T val = volatileLoad(0x05); "val" ~ op ~ "= rhs;" volatileStore(0x05, val);

but if you instead write

    @property ref Volatile!ubyte PORTA() { return *(cast(Volatile!(ubyte)*)0x05); }

PORTA |= now calls a function behind the scenes. The backend does not immediately know that PORTA is always 0x05. Also the this pointer in opOpAssign is no longer a compile time constant. And this is where the constant/runtime value code gen difference discussed above matters.

-O mostly fixes performance problems, but adding an additional property function is still much uglier than declaring an extern variable with an address in many ways. (compiler bugs, user-facing code, debug info, ...) Also it's a conceptually nice way for typed registers: You can read it as: I've got a Register of type PORT which is an extern variable located at a fixed address. PORT abstracts away volatile access.
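For anyone who wants to try the wrapper idea without hardware, here is a minimal sketch that runs on a desktop. It is an illustration under assumptions, not the code from the pastebin: pragma(address) and @attribute("inlineonly") are left out because they are proposals or compiler-specific, the register is an ordinary struct member rather than a fixed address, and the load() and store() helpers are hypothetical additions for testing. It also assumes a recent druntime, where the intrinsics are imported from core.volatile (at the time of this thread they were in core.bitop).

```d
import core.volatile : volatileLoad, volatileStore;

/// Sketch of a volatile wrapper; only ubyte, ushort, uint and ulong work,
/// because those are the types the intrinsics accept.
struct Volatile(T)
{
    private T _store;

    @disable this(this);   // a register must not be copied by accident

    /// Hypothetical helper: exactly one volatile load.
    T load() nothrow @trusted
    {
        return volatileLoad(&_store);
    }

    /// Hypothetical helper: exactly one volatile store.
    void store(T value) nothrow @trusted
    {
        volatileStore(&_store, value);
    }

    /// Performs 1 load followed by 1 store, e.g. for `reg |= mask`.
    void opOpAssign(string op)(in T rhs) nothrow @trusted
    {
        T val = volatileLoad(&_store);
        mixin("val " ~ op ~ "= rhs;");
        volatileStore(&_store, val);
    }
}

unittest
{
    Volatile!ubyte reg;          // stands in for PORTA
    reg.store(0b1000);
    reg |= 0b0001;               // read-modify-write through opOpAssign
    assert(reg.load() == 0b1001);
}
```

On a real target the wrapper instance would sit at the register's address, which is exactly the part the pragma(address) proposal is about.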
Re: H1 2015 Priorities and Bare-Metal Programming
On Monday, February 02, 2015 13:01:28 Walter Bright via Digitalmars-d wrote: On 2/2/2015 6:43 AM, Manu via Digitalmars-d wrote: I'm pretty sure the only controversy is that you want it to be a pragma, everyone else wants it to be an attribute. That's correct. My reasoning is simple - an attribute defines the semantics of the interface, a pragma gives instructions to the compiler, and does not affect logical semantics. For example, attributes change the name mangling, because it affects the semantic interface. A pragma would not. That makes sense, though the one issue that I see with making it a pragma is the fact that pragmas are supposed to be compiler-specific and not part of the language (at least as I understand it), and I would expect that anyone looking to force inlining would want it guaranteed regardless of the compiler. Of course, all compilers could implement it, and with a shared frontend for most of the D compilers, it would likely be in most of them anyway, but if it's a pragma, it does seem like it wouldn't necessarily be guaranteed. - Jonathan M Davis
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/2015 1:39 AM, Johannes Pfau wrote: No, it doesn't even come close. * Ports.B += 7 doesn't work. This should not be done with MMIO because the read and write cycles generated are ill-defined and vary based on obscure backend details. In order to implement it you need a Volatile!ubyte wrapper, return by ref and avoid some compiler bugs What compiler bugs? * You do need force-inline to produce halfway decent code Nope. volatileLoad() and volatileStore() do not produce function calls. * You also need to enable backend optimization to produce decent code Not any more true than with volatile types, because the compiler intrinsic actually translates to a volatile type, not a function call. You are making a lot of assumptions that volatileLoad() and volatileStore() do not work. Please try it, examine the generated code, and see.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2 February 2015 at 10:57, Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote: On 2/2/2015 1:39 AM, Johannes Pfau wrote: No, it doesn't even come close. * Ports.B += 7 doesn't work. This should not be done with MMIO because the read and write cycles generated are ill-defined and vary based on obscure backend details. In order to implement it you need a Volatile!ubyte wrapper, return by ref and avoid some compiler bugs What compiler bugs? * You do need force-inline to produce halfway decent code Nope. volatileLoad() and volatileStore() do not produce function calls. I think he was referring to ubyte B() and void B(), and not the load/store intrinsics themselves.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/2015 1:24 AM, Johannes Pfau wrote: Usually those people just don't use volatile as long as their code works. Once it breaks they add volatile everywhere till it works again. Having a volatile type is not going to make things better for such people. In fact, it may make things worse. It's harder to muck up volatileLoad() and volatileStore(). The compiler intrinsics participate in all optimizations. Not sure what that's supposed to mean. The backend can generate more efficient code if it knows that an address is a literal value. If you add wrappers (void f(void* p) {volatileLoad(p)}) the information that p is a literal constant is lost and needs to be recovered by the backend, which is only done with enabled backend optimizations. Please try it before deciding it does not work.
Re: H1 2015 Priorities and Bare-Metal Programming
On Monday, 2 February 2015 at 09:22:41 UTC, Walter Bright wrote: On 2/1/2015 11:16 PM, Iain Buclaw via Digitalmars-d wrote: Where is the @property? :) yeah, yeah !! BTW, when will D's property finally be cleaned up?... Did I miss something, or is there still a -property flag out there?
Re: H1 2015 Priorities and Bare-Metal Programming
On Sun, 01 Feb 2015 21:48:40 -0800, Walter Bright newshou...@digitalmars.com wrote: On 2/1/2015 9:21 PM, Daniel Murphy wrote: Walter Bright wrote in message news:mam6qe$15nu$1...@digitalmars.com... We also need a pragma(address) to complement pragma(mangle). What would that do? It would allow naming a memory address, similar to using .org in assembly. eg pragma(address, 0x0025) shared ubyte PORTB; static assert(&PORTB == cast(ubyte*)0x0025); This is a much nicer version of C's #define PORTB (*(volatile unsigned char *)0x0025) That's what I suspected :-) struct Ports { static ubyte B() { return volatileLoad(cast(ubyte *)0x0025); } static void B(ubyte value) { volatileStore(cast(ubyte *)0x0025, value); } } ... Ports.B = 7; foo(Ports.B); gets the job done.

No, it doesn't even come close.
* Ports.B += 7 doesn't work. In order to implement it you need a Volatile!ubyte wrapper, return by ref and avoid some compiler bugs
* You do need force-inline to produce halfway decent code
* You also need to enable backend optimization to produce decent code

This has been discussed before and the best way to express this in the language is

    @property ref @attribute("inlineonly") Volatile!ubyte PORTB() { return *cast(Volatile!ubyte*)0x0025; }

You need cross-module inlining, a way to avoid actually generating a callable function (inlineonly =/= forceinline) to avoid bloat and optimization must be enabled. Cross-module inlining is not supported in GDC and not trivial to implement. Also the code looks ugly.
pragma address is:
* Easy to implement (https://github.com/D-Programming-microD/GDC/commit/a4027b6d9c53a186c142244553861af8cce5492f)
* A logical, consistent paradigm (if there are extern variables with specific names, why not variables with specific addresses) and an extension of pragma(mangle)
* Easy to use
* Enforces that the address is a compile time constant, produces perfect ASM code even without optimization
* No need to define a wrapper function, with all the consequences that hack requires (inlineonly)
* Apparently already in other languages

Given that we can implement pragmas as compiler backend vendors the bar to include pragmas into dmd also shouldn't be too high. Otherwise it'll just be implemented as a compiler dependent pragma.
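The first bullet in the rebuttal ("Ports.B += 7 doesn't work") can be checked on a desktop with a small sketch. The assumptions: fakePortB is a hypothetical stand-in variable replacing the fixed address 0x0025 so the code can run outside a microcontroller, and the intrinsics are imported from core.volatile as in recent druntime releases.

```d
import core.volatile : volatileLoad, volatileStore;

// Stand-in for the hardware register; on a real target the functions
// below would use cast(ubyte*)0x0025 instead.
__gshared ubyte fakePortB;

struct Ports
{
    static ubyte B() { return volatileLoad(&fakePortB); }
    static void B(ubyte value) { volatileStore(&fakePortB, value); }
}

// Plain writes and reads go through the setter and getter.
unittest
{
    Ports.B = 7;
    assert(Ports.B() == 7);
}

// A read-modify-write is rejected: the getter returns an rvalue and the
// compiler does not rewrite `+=` into a getter call followed by a setter call.
static assert(!__traits(compiles, Ports.B += 7));
```

This is why the discussion keeps coming back to a Volatile!T wrapper with opOpAssign, returned by ref or placed at a fixed address.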
Re: H1 2015 Priorities and Bare-Metal Programming
Walter Bright wrote in message news:map18m$1dvv$1...@digitalmars.com... Now, when it can't inline, do you expect the compiler to produce an error message? Yes. If so, what corrective action is the user faced with: The user can modify the code to allow it to be inlined. There are a huge number of constructs that cause dmd's inliner to completely give up. If a function _must_ be inlined, the compiler needs to give an error if it fails.
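As a rough illustration of what is being asked for, this is how wrappers around the intrinsics look with the pragma(inline, true) spelling from DIP56, which D compilers have since implemented. Treat the details as assumptions: readReg and setBits are hypothetical names, the intrinsics are imported from core.volatile as in recent druntime releases, and whether a failed inline is reported as an error depends on the compiler and its flags, so check your own toolchain.

```d
import core.volatile : volatileLoad, volatileStore;

// Hypothetical wrapper: asks the compiler to inline every call, so the
// load should end up as a single instruction at the call site.
pragma(inline, true)
ubyte readReg(ubyte* p) nothrow @nogc
{
    return volatileLoad(p);
}

// Hypothetical wrapper: one volatile load followed by one volatile store.
pragma(inline, true)
void setBits(ubyte* p, ubyte mask) nothrow @nogc
{
    volatileStore(p, cast(ubyte)(volatileLoad(p) | mask));
}

unittest
{
    ubyte fake = 0x10;           // ordinary variable standing in for a register
    setBits(&fake, 0x01);
    assert(readReg(&fake) == 0x11);
}
```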
Re: H1 2015 Priorities and Bare-Metal Programming
Just go with __gshared. Or even better, avoid globals ;).
Re: H1 2015 Priorities and Bare-Metal Programming
Johannes Pfau wrote in message news:maotpd$1ape$1...@digitalmars.com... but if you instead write @property ref Volatile!ubyte PORTA() { return *(cast(Volatile!(ubyte)*)0x05); } PORTA |= now calls a function behind the scenes. The backend does not immediately know that PORTA is always 0x05. Also the this pointer in opOpAssign is no longer a compile time constant. And this is where the constant/runtime value code gen difference discussed above matters. It may not immediately know, but with guaranteed inlining it becomes a non-issue. -O mostly fixes performance problems, but adding an additional property function is still much uglier than declaring an extern variable with an address in many ways. (compiler bugs, user-facing code, debug info, ...) I agree that pragma(address) is nicer.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2 Feb 2015 23:45, Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote: On 2/2/2015 2:30 PM, Johannes Pfau wrote: It does: if the backend can't know that a value is known at compile time it can't use absolute addresses: void test(ubyte* ptr) { volatileLoad(ptr); // Can't use literal addressing, might be a runtime value } The context here is that pragma(address) allows avoiding one wrapper function. See below. Again, that is simply an inlining issue. The pragma(address, 0x05) makes sure that the compiler backend always knows that PORTA is at 0x05. Constant propagation and inlining do that. Both are standard optimizations that every compiler does. Adding language features on the presumption that compilers won't do that is like trying to fix the broken engine in your car by adding another engine in the trunk. -O mostly fixes performance problems, but adding an additional property function is still much uglier than declaring an extern variable with an address in many ways. (compiler bugs, Language features should not be added because of compiler bugs. user-facing code, Library wrapper types will be showing up more and more. How nice they are is up to the library designer. debug info, ...) Symbolic debugging is always going to be an issue until there are debuggers that are better designed to work with D. It's more a marriage than a one way street. DMD still needs to produce the goods in order for debuggers to turn it into meaningful data. Iain.
Re: H1 2015 Priorities and Bare-Metal Programming
On Monday, 2 February 2015 at 14:43:22 UTC, Manu wrote: On 2 February 2015 at 07:47, Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote: On 2/1/2015 3:29 AM, weaselcat wrote: On Sunday, 1 February 2015 at 11:22:04 UTC, Johannes Pfau wrote: which perform as well as C code, but only with force-inline why is this still not part of the language? I'm not sure of anything else that has been repeatedly asked for without any good counterarguments. Because http://wiki.dlang.org/DIP56 generated nothing but controversy. I'm pretty sure the only controversy is that you want it to be a pragma, everyone else wants it to be an attribute. I don't think anyone has argued against a force_inline. I don't care if it is a pragma or attribute or whatever, it just *needs to exist*.
Re: H1 2015 Priorities and Bare-Metal Programming
On Monday, 2 February 2015 at 14:43:22 UTC, Manu wrote: I'm pretty sure the only controversy is that you want it to be a pragma, everyone else wants it to be an attribute. I don't think anyone has argued against a force_inline. I also support pragma, but didn't agree with proposed semantics (that force_inline doesn't actually force). And Walter completely disagreed with any alternative proposals making any further progress in search of compromise effectively impossible.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2 February 2015 at 07:47, Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote: On 2/1/2015 3:29 AM, weaselcat wrote: On Sunday, 1 February 2015 at 11:22:04 UTC, Johannes Pfau wrote: which perform as well as C code, but only with force-inline why is this still not part of the language? I'm not sure of anything else that has been repeatedly asked for without any good counterarguments. Because http://wiki.dlang.org/DIP56 generated nothing but controversy. I'm pretty sure the only controversy is that you want it to be a pragma, everyone else wants it to be an attribute. I don't think anyone has argued against a force_inline.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2 February 2015 at 20:57, Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote: On 2/2/2015 1:39 AM, Johannes Pfau wrote: * You do need force-inline to produce halfway decent code Nope. volatileLoad() and volatileStore() do not produce function calls. They need to be wrapped to be useful, and in this case, the wrapping should not result in function calls either. I have the same problem with simd intrinsics. Intrinsics are useless if they are to be wrapped by a function call to use them in a practical way.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/15 6:55 AM, Dicebot wrote: On Monday, 2 February 2015 at 14:43:22 UTC, Manu wrote: I'm pretty sure the only controversy is that you want it to be a pragma, everyone else wants it to be an attribute. I don't think anyone has argued against a force_inline. I also support pragma, but didn't agree with proposed semantics (that force_inline doesn't actually force). And Walter completely disagreed with any alternative proposals making any further progress in search of compromise effectively impossible. I think it's time to reopen that negotiation. -- Andrei
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/15 8:42 AM, Johannes Pfau wrote: Again the problem is not volatileLoad/Store which translate to single instructions it's wrappers. So does the argument boil down to better inlining control and enforcement? -- Andrei
Re: H1 2015 Priorities and Bare-Metal Programming
On Mon, 02 Feb 2015 02:57:28 -0800, Walter Bright newshou...@digitalmars.com wrote: On 2/2/2015 1:39 AM, Johannes Pfau wrote: No, it doesn't even come close. * Ports.B += 7 doesn't work. This should not be done with MMIO because the read and write cycles generated are ill-defined and vary based on obscure backend details. Operator overloading? If += 7 is implemented in a wrapper as a volatileLoad, a modify and a volatileStore, your whole point is void. In order to implement it you need a Volatile!ubyte wrapper, return by ref and avoid some compiler bugs What compiler bugs? * You do need force-inline to produce halfway decent code Nope. volatileLoad() and volatileStore() do not produce function calls. This was a reply to your example. Your example used a function which wrapped volatileLoad: static ubyte B() { return volatileLoad(cast(ubyte *)0x0025); } I think it's obvious where a force-inline would go. * You also need to enable backend optimization to produce decent code Not any more true than with volatile types, because the compiler intrinsic actually translates to a volatile type, not a function call. I've explained that in detail: https://forum.dlang.org/post/manfpc$2v0u$1...@digitalmars.com I'm not going to explain it again. Also your focus on intrinsics is wrong, I'm not talking about the intrinsics, I'm talking about wrappers. You are making a lot of assumptions that volatileLoad() and volatileStore() do not work. Please try it, examine the generated code, and see. Nice ad hominem. I've implemented volatileLoad/store for GDC long before you implemented it in DMD. I've written a Volatile!T wrapper, a Register wrapper and a tool which scrapes datasheets for Register definitions and automatically generates Register definitions. I've also fixed GDC for 8bit AVR processors and run and tested D code with volatileLoad/Store on these processors. Three months ago. I'm not making any assumptions about how volatileLoad/Store work, I know it quite well.
Again the problem is not volatileLoad/Store which translate to single instructions it's wrappers. All the points I made come from experience implementing and using these wrappers. I don't see any point continuing this discussion as long as you don't take me seriously. At least you could read my replies properly.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2 February 2015 at 17:43, Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 2/2/15 9:23 AM, Iain Buclaw via Digitalmars-d wrote: That code doesn't work with DMD. http://goo.gl/hgsHg0 Has that been filed yet? -- Andrei https://issues.dlang.org/show_bug.cgi?id=14114
Re: H1 2015 Priorities and Bare-Metal Programming
On Mon, 02 Feb 2015 08:55:59 -0800, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: On 2/2/15 8:42 AM, Johannes Pfau wrote: Again the problem is not volatileLoad/Store which translate to single instructions it's wrappers. So does the argument boil down to better inlining control and enforcement? -- Andrei Mostly that, but not only that. It's also necessary that the compiler knows after inlining that the address is a literal. Loading data from fixed literal addresses produces different, more efficient code than loading from a runtime address. As the function code will generally be written for runtime values the compiler must optimize after inlining to recognize the inlined code deals with literals. The GCC backend performs these optimizations only if optimization is enabled. We could always do this in the dmd frontend inliner but LDC and GDC don't/can't use the frontend inliner. That's a lot of work given that pragma(address) is a simple, consistent solution and not even a real language change.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2 February 2015 at 17:06, Johannes Pfau via Digitalmars-d digitalmars-d@puremagic.com wrote: On Mon, 02 Feb 2015 02:49:48 -0800, Walter Bright newshou...@digitalmars.com wrote: On 2/2/2015 1:24 AM, Johannes Pfau wrote: Usually those people just don't use volatile as long as their code works. Once it breaks they add volatile everywhere till it works again. Having a volatile type is not going to make things better for such people. In fact, it may make things worse. It's harder to muck up volatileLoad() and volatileStore(). Wrong: wrappers are provided by the library. Users simply do stuff like PORTB.pin0 = Level.high; they never have to use volatile directly. And it's actually the same in C which provides macros. I wrote this in reply to your statements that wrappers are not necessary, of course you didn't quote that part. Without wrappers, end users do have to use volatileLoad/Store by themselves and now they do need to know what volatile is, where it's necessary, ... The compiler intrinsics participate in all optimizations. Not sure what that's supposed to mean. The backend can generate more efficient code if it knows that an address is a literal value. If you add wrappers (void f(void* p) {volatileLoad(p)}) the information that p is a literal constant is lost and needs to be recovered by the backend, which is only done with enabled backend optimizations. Please try it before deciding it does not work. I guess one ad hominem wasn't enough? http://goo.gl/Y9OFgG The error DMD gives when optimizations are turned on is comical at best too. Iain.
Re: H1 2015 Priorities and Bare-Metal Programming
On Mon, Feb 02, 2015 at 08:55:59AM -0800, Andrei Alexandrescu via Digitalmars-d wrote: On 2/2/15 8:42 AM, Johannes Pfau wrote: Again the problem is not volatileLoad/Store which translate to single instructions it's wrappers. So does the argument boil down to better inlining control and enforcement? -- Andrei FWIW, from the POV of a bystander, the point is that force-inline for certain things (namely wrappers around certain intrinsics, like operator overloads for atomics, what-have-you) is necessary in order to guarantee that function call overhead will not be incurred no matter what. Walter seems to dislike forced inlining for various reasons, preferring inlining as a hint at the most, and he probably has a point in most cases (let the compiler make the judgment). But in other cases, such as the one in question, the user needs to override the compiler's decision. Currently there's no way to do that, and it's a showstopper for those users. Unless we're proposing to flood the compiler with intrinsics, one for every possible operator overload of volatile loads/stores, which I think should be obviously infeasible. And even then, you still might miss one or two other obscure wrappers that users might discover that they need. It seems reasonable that instead of burdening the compiler (and compiler maintainers, and porters) to keep up with an ever-expanding set of intrinsics, making use of the language to express what is needed via forced-inline functions is a better way to do things. T -- If it's green, it's biology, If it stinks, it's chemistry, If it has numbers it's math, If it doesn't work, it's technology.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/15 12:06 PM, Johannes Pfau wrote: On Mon, 02 Feb 2015 02:49:48 -0800, Walter Bright newshou...@digitalmars.com wrote: Please try it before deciding it does not work. I guess one ad hominem wasn't enough? Sorry, I'm not really vested in this discussion at all, but I don't think you realize what ad hominem means. http://en.wikipedia.org/wiki/Ad_hominem -Steve
Re: H1 2015 Priorities and Bare-Metal Programming
On Monday, 2 February 2015 at 16:55:59 UTC, Andrei Alexandrescu wrote: I think it's time to reopen that negotiation. +1 So does the argument boil down to better inlining control and enforcement? -- Andrei

If we reopen this I think we should start at the beginning and not yet concentrate on implementation details. The discussion should not be developers against users. Developers make things _for_ users. If this was a commercial product, lack of listening to users' needs would be fatal to the company.

The examples so far have been around a single register. There are single registers in 8 bit processors. Modern 32 bit processors have register banks that have tens of registers, 32 bit each. They are accessed through structs that may contain arrays, substructs etc. It would be better that the solution we will choose would apply to the whole structure and transitively to all its members.

An example that is typical in real use:

1 regs.ctrl |= 0x20; // select some mode
2 regs.ctrl |= 0x1000; // transmitter on
3 foreach ( b ; buf ) // send a buffer of bytes
  {
4   while ((regs.status & 0x40) == 0) {} // wait that the transmitter is ready
5   regs.data = b; // send the byte
  }
6 regs.ctrl &= ~0x20; // transmitter off
7 c = regs.data; // look if there is something to receive

In here the regs struct represents the registers of some peripheral. What does the compiler think? 1 and 2 are removed because 6 will overwrite the variable anyway. 4 may be moved before 3 because status is not changed in the loop. The loop may be removed totally because the last of 5 overwrites the previous anyway. 7 does not read the register because it uses cached data from 5 instead.

I want to use basic operators and language features to access registers, not templates or functions or wrappers. I just hope we have one word that I will add to the definition of the register struct, and then the struct would behave as expected.
I do not care if it is a pragma or a keyword or a property or whatever, but it has to be something in the definition and not something I have to type every time I read or write a register.
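With the intrinsics as they exist, the closest library-level approximation to "something in the definition" is to give every field of the register struct a wrapper type, so the volatile handling is written once in the definition and the user code keeps its ordinary operators. The following is only a sketch under assumptions: Reg, Uart and send are hypothetical names, the layout is made up, the struct lives in normal memory so it can run on a desktop, and the intrinsics are imported from core.volatile as in recent druntime releases. It is not transitive in the sense asked for above; each field has to be declared with the wrapper.

```d
import core.volatile : volatileLoad, volatileStore;

/// Hypothetical register wrapper: every read and write is one volatile access.
struct Reg(T)
{
    private T raw;

    @disable this(this);   // registers are never copied

    T get() nothrow @trusted { return volatileLoad(&raw); }

    void opAssign(T value) nothrow @trusted { volatileStore(&raw, value); }

    void opOpAssign(string op)(T rhs) nothrow @trusted
    {
        T value = volatileLoad(&raw);
        mixin("value " ~ op ~ "= rhs;");
        volatileStore(&raw, value);
    }
}

/// Hypothetical peripheral: three 32 bit registers in a row.
struct Uart
{
    Reg!uint ctrl;
    Reg!uint status;
    Reg!uint data;
}

/// The sequence from the example, lines 1 to 6; no access may be dropped or merged.
void send(ref Uart regs, const(ubyte)[] buf)
{
    regs.ctrl |= 0x20;                              // select some mode
    regs.ctrl |= 0x1000;                            // transmitter on
    foreach (b; buf)                                // send a buffer of bytes
    {
        while ((regs.status.get() & 0x40) == 0) {}  // wait until the transmitter is ready
        regs.data = b;                              // send the byte
    }
    regs.ctrl &= ~0x20u;                            // clear the mode bit again
}

unittest
{
    Uart fake;               // plain memory standing in for the peripheral
    fake.status = 0x40;      // pretend the transmitter is always ready
    send(fake, [1, 2, 3]);
    assert(fake.data.get() == 3);
    assert(fake.ctrl.get() == 0x1000);
}
```

Reads still need an explicit get() in this sketch, which is part of the complaint above: only a language-level or definition-level marker would remove that last bit of ceremony.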
Re: H1 2015 Priorities and Bare-Metal Programming
On Mon, 02 Feb 2015 02:49:48 -0800, Walter Bright newshou...@digitalmars.com wrote: On 2/2/2015 1:24 AM, Johannes Pfau wrote: Usually those people just don't use volatile as long as their code works. Once it breaks they add volatile everywhere till it works again. Having a volatile type is not going to make things better for such people. In fact, it may make things worse. It's harder to muck up volatileLoad() and volatileStore(). Wrong: wrappers are provided by the library. Users simply do stuff like PORTB.pin0 = Level.high; they never have to use volatile directly. And it's actually the same in C which provides macros. I wrote this in reply to your statements that wrappers are not necessary, of course you didn't quote that part. Without wrappers, end users do have to use volatileLoad/Store by themselves and now they do need to know what volatile is, where it's necessary, ... The compiler intrinsics participate in all optimizations. Not sure what that's supposed to mean. The backend can generate more efficient code if it knows that an address is a literal value. If you add wrappers (void f(void* p) {volatileLoad(p)}) the information that p is a literal constant is lost and needs to be recovered by the backend, which is only done with enabled backend optimizations. Please try it before deciding it does not work. I guess one ad hominem wasn't enough? http://goo.gl/Y9OFgG

    _Dmain:
        push rbp
        mov  rbp,rsp
        sub  rsp,0x10
        mov  rax,0x5                  <==
        mov  QWORD PTR [rbp-0x8],rax
        mov  ecx,DWORD PTR [rax]      <== a register based load

The instruction it should generate is mov ecx, [0x5] Not sure if it's actually more efficient on X86 but it makes a huge difference on real microcontroller architectures.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/15 9:23 AM, Iain Buclaw via Digitalmars-d wrote:
> That code doesn't work with DMD. http://goo.gl/hgsHg0

Has that been filed yet? -- Andrei
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/15 9:06 AM, Johannes Pfau wrote:
> I guess one ad hominem wasn't enough?

Please cool it, will you? That doesn't quite qualify. -- Andrei
Re: H1 2015 Priorities and Bare-Metal Programming
On 2 February 2015 at 05:48, Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote:
> On 2/1/2015 9:21 PM, Daniel Murphy wrote:
> > Walter Bright wrote in message news:mam6qe$15nu$1...@digitalmars.com...
> > > > We also need a pragma(address) to complement pragma(mangle).
> > >
> > > What would that do?
> >
> > It would allow naming a memory address, similar to using .org in assembly, e.g.
> >
> >     pragma(address, 0x0025) shared ubyte PORTB;
> >     static assert(&PORTB == cast(ubyte*)0x0025);
> >
> > This is a much nicer version of C's
> >
> >     #define PORTB (*(volatile unsigned char *)0x0025)
>
> That's what I suspected :-)
>
>     struct Ports {
>         static ubyte B() { return volatileLoad(cast(ubyte *)0x0025); }
>         static void B(ubyte value) { volatileStore(cast(ubyte *)0x0025, value); }
>     }
>     ...
>     Ports.B = 7;
>     foo(Ports.B);
>
> gets the job done. Of course, you could take it further and make a template out of it:
>
>     auto Ports = Port!(ubyte, 0x0025);

That code doesn't work with DMD. http://goo.gl/hgsHg0

Iain.
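To make the template idea above concrete, here is one way such a Port template could be written. This is a hedged sketch, not the code behind either link in this thread: the names Port, value and PORTB are made up for illustration, an alias is used instead of the `auto Ports = ...` line quoted above, and core.volatile is the module where current druntime keeps the intrinsics.

```d
import core.volatile : volatileLoad, volatileStore;

// Hypothetical helper: a memory-mapped port of type T at a fixed address.
// The address is a template argument, so it is a compile-time constant
// inside the accessors rather than a runtime pointer.
struct Port(T, size_t addr)
{
    enum address = addr;

    // Getter: a single volatile load from the fixed address.
    @property static T value()
    {
        return volatileLoad(cast(T*) address);
    }

    // Setter: a single volatile store to the fixed address.
    @property static void value(T v)
    {
        volatileStore(cast(T*) address, v);
    }
}

// Same address as the PORTB example quoted above.
alias PORTB = Port!(ubyte, 0x0025);

static assert(PORTB.address == 0x0025);
static assert(is(typeof(PORTB.value()) == ubyte));

void main()
{
    // Nothing is accessed here: `PORTB.value = 7;` or `foo(PORTB.value);`
    // is only meaningful on a target that really has a register at 0x0025.
    // On a hosted OS the access would most likely fault.
}
```

Whether the accessors end up as a single load or store at the call site still depends on inlining and constant propagation, which is the concern raised elsewhere in this thread.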
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/2/15 9:15 AM, Johannes Pfau wrote:
> On Mon, 02 Feb 2015 08:55:59 -0800, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote:
> > On 2/2/15 8:42 AM, Johannes Pfau wrote:
> > > Again, the problem is not volatileLoad/Store, which translate to single instructions; it's the wrappers.
> >
> > So does the argument boil down to better inlining control and enforcement? -- Andrei
>
> Mostly that, but not only that. It's also necessary that the compiler knows after inlining that the address is a literal. Loading data from fixed literal addresses produces different, more efficient code than loading from a runtime address. As the function code will generally be written for runtime values, the compiler must optimize after inlining to recognize that the inlined code deals with literals. The GCC backend performs these optimizations only if optimization is enabled. We could always do this in the dmd frontend inliner, but LDC and GDC don't/can't use the frontend inliner.
>
> That's a lot of work given that pragma(address) is a simple, consistent solution and not even a real language change.

I suggest we push forward with better inlining control and better optimizations, but not pragma(address). -- Andrei
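As an illustration of the "inlining control" side of this, here is a small sketch of a register wrapper marked with pragma(inline, true), the spelling discussed at the top of this thread. The helper name setBits and the fake register in main() are assumptions for the example; what a compiler does when it cannot honor the pragma differs between implementations, so check the documentation of the compiler in use.

```d
import core.volatile : volatileLoad, volatileStore;

// Hypothetical wrapper: set the bits in `mask` in the register at `reg`.
// The pragma asks the compiler to inline every call to this function.
pragma(inline, true)
void setBits(uint* reg, uint mask)
{
    volatileStore(reg, volatileLoad(reg) | mask);
}

void main()
{
    // Simulation only: an ordinary variable stands in for a hardware register.
    uint fakeReg = 0;
    setBits(&fakeReg, 0x20);
    setBits(&fakeReg, 0x1000);
    assert(fakeReg == 0x1020);
}
```

Note that inlining alone does not address the second point quoted above: for a call such as setBits(cast(uint*) 0x4000, 0x20), the backend still has to propagate the literal address into the inlined body to emit a direct load and store.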
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/1/15 8:41 PM, Joakim wrote:
> On Sunday, 1 February 2015 at 23:41:22 UTC, Andrei Alexandrescu wrote:
> > There's something we need to explain about the vision document itself. Do I want to see more of Mike's awesome work in D going forward? Yes. Do I want to see D on mobile? Of course. There's a lot of stuff that Walter and I would like to see happen that's not in the document. The document itself includes things that he and I actually believe we can work on and make happen. (In the case of vibe.d, we made sure we asked Sönke.) It doesn't include awesome things that others can do without our help - and it shouldn't.
>
> Yes, this needs to be emphasized, as it isn't obvious that you limited it only to stuff that you and Walter can personally enable. Perhaps you can expand the wiki page with a section for stuff that you two would like to see, but cannot personally enable.
>
> Historically, the D community has been horrible at communicating goals like this, with all current efforts buried in miles of mailing list threads that few outsiders are ever going to wade through, if that. By putting these goals on the wiki, even if you can't personally enable them, someone might see a goal, decide they'd like to have that too, and start working on it, secure in the knowledge that it's wanted and is likely to be merged if certain quality standards are met.
>
> I suggest you add another clearly labeled section with such goals: what you would like to see happen but cannot work on and make happen. The community might pick those up and run with them without you.

OK, did so. Thanks! -- Andrei
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/1/2015 11:16 PM, Iain Buclaw via Digitalmars-d wrote:
> Where is the @property? :)

yeah, yeah !!
Re: H1 2015 Priorities and Bare-Metal Programming
On Sun, 01 Feb 2015 13:45:24 -0800, Walter Bright newshou...@digitalmars.com wrote:
> On 2/1/2015 3:22 AM, Johannes Pfau wrote:
> > On Sun, 01 Feb 2015 02:11:42 -0800, Walter Bright newshou...@digitalmars.com wrote:
> > > core.bitop.volatileLoad() and volatileStore() are implemented, and do the job. They are compiler intrinsics that result in single instructions. They support 8, 16, 32 and 64 bit loads and stores.
> >
> > I think everybody agreed that these low-level primitives can't be used in end-user code.
>
> I apparently missed that discussion. (In any case, dealing with memory mapped I/O is not a usual programming task, I expect a programmer doing it will be more sophisticated.)

You keep saying that, but it's simply not true. It's a common task when programming microcontrollers. And people working on microcontrollers are often electrical engineers. Beginner programming courses in EE teach almost nothing, especially not why volatile is necessary or how it propagates with pointers etc. Usually those people just don't use volatile as long as their code works. Once it breaks they add volatile everywhere till it works again.

> > We can generate nice wrappers (nicer than C code), which perform as well as C code, but only with force-inline _and_ enabled optimizations (we essentially need heavy constant folding).
>
> The compiler intrinsics participate in all optimizations.

Not sure what that's supposed to mean. The backend can generate more efficient code if it knows that an address is a literal value. If you add wrappers (void f(void* p) {volatileLoad(p)}) the information that p is a literal constant is lost and needs to be recovered by the backend, which is only done with enabled backend optimizations.
Re: H1 2015 Priorities and Bare-Metal Programming
On Sunday, 1 February 2015 at 10:14:28 UTC, Walter Bright wrote:
> Please post to bugzilla and tag the issue with bare-metal.

https://issues.dlang.org/show_bug.cgi?id=14101

Please have a look and correct if necessary.
Re: H1 2015 Priorities and Bare-Metal Programming
On 2/1/2015 2:12 AM, eles wrote:
> On Sunday, 1 February 2015 at 10:11:57 UTC, eles wrote:
> > The absolute minimum set of changes that I had to make can be seen here: https://bitbucket.org/timosi/minlib/src/8674af49718880021c2777d60ac2091bc99c0107/Changes?at=default
>
> corrected link (I think): https://bitbucket.org/timosi/minlibd/src/8674af49718880021c2777d60ac2091bc99c0107/Changes?at=default
>
> And, for info, pasted here:
>
> This file is not a changelog. Instead, this file documents changes that are needed to the current gdc libdruntime sources. (2.065 and 4.10 series gdc/gcc)

Please post to bugzilla and tag the issue with bare-metal.
Re: H1 2015 Priorities and Bare-Metal Programming
On Sun, 01 Feb 2015 02:11:42 -0800, Walter Bright newshou...@digitalmars.com wrote:
> On 2/1/2015 1:38 AM, Timo Sintonen wrote:
> > One of the major issues is: how to access hardware. We need a language feature to access hardware registers. This has been discussed twice. Both times you rejected anything but your own idea of library functions. You rejected anything anybody said. No serious programmer will write code that way. It worked in the 80's when we had a UART with three registers, 8 bits each. Now a USB or Ethernet peripheral may have 100 registers, 32 bits each.
>
> core.bitop.volatileLoad() and volatileStore() are implemented, and do the job. They are compiler intrinsics that result in single instructions. They support 8, 16, 32 and 64 bit loads and stores.

I think everybody agreed that these low-level primitives can't be used in end-user code.

We can generate nice wrappers (nicer than C code), which perform as well as C code, but only with force-inline _and_ enabled optimizations (we essentially need heavy constant folding).

We also need a pragma(address) to complement pragma(mangle).

Here's a proof-of-concept running D on an 8-bit AVR: https://github.com/D-Programming-microD/avr-playground/blob/master/src/test.d

WIP implementation of Volatile!T and a register string mixin: http://pastebin.com/sb58UW00

* Volatile!T: Line 16
* Volatile!T usage: Line 73
* generateRegisterType: Line 150
* usage: Line 353
* sample output: 356

So I'd say there are not too many language problems; the main problem is runtime/compiler interaction:

* If you don't want to use any runtime at all, that's actually the easier part. We'd need to implement a little more of betterC, but this can be done easily. Mike would prefer the compiler to autodetect the capabilities of the runtime (implemented hooks) instead of using compiler switches. That'd be better but some more work.
* Using only part of druntime is ugly. The one thing most people would probably like to strip out is the GC, while keeping exception handling, threads, ... But the GC is everywhere: core.demangle doesn't work without it, nor do backtraces, exceptions or threads. Right now you either use all of druntime or nothing; it's not possible to use only parts of druntime because it's not modular enough.
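For readers who cannot open the pastebin, here is a rough sketch of the general idea of a Volatile!T wrapper: a struct that stores a T and routes every read and write through the volatile intrinsics, so that register structs can be used with ordinary operators. This is not the pastebin code, which is not reproduced here; the member names raw and load, the UartRegs layout and the simulated main() are assumptions for illustration, and core.volatile is the module where current druntime keeps the intrinsics.

```d
import core.volatile : volatileLoad, volatileStore;

// Hypothetical wrapper: every access to the wrapped value is volatile.
struct Volatile(T)
{
    private T raw;

    // Read: one volatile load.
    T load() { return volatileLoad(&raw); }

    // `reg = value`: one volatile store.
    void opAssign(T value) { volatileStore(&raw, value); }

    // `reg |= value`, `reg &= value`, ...: volatile load, modify, volatile store.
    void opOpAssign(string op)(T value)
    {
        volatileStore(&raw, cast(T) mixin("volatileLoad(&raw) " ~ op ~ " value"));
    }

    // Let the wrapper be read wherever a T is expected.
    alias load this;
}

// Hypothetical peripheral layout built from wrapped fields.
struct UartRegs
{
    Volatile!uint ctrl, status, data;
}

// The wrapper adds no storage, so the layout matches three plain uints.
static assert(UartRegs.sizeof == 3 * uint.sizeof);

void main()
{
    // Simulation only: ordinary memory stands in for the peripheral.
    UartRegs regs;
    regs.status = 0x40;
    regs.ctrl |= 0x20;
    regs.ctrl |= 0x1000;
    if ((regs.status & 0x40) != 0)
        regs.data = 0x55;
    regs.ctrl &= ~0x20;
    uint c = regs.ctrl;
    assert(c == 0x1000);
}
```

With a wrapper like this the volatile-ness is stated once, in the field types, rather than at every access. The cost is the one described in this message: the operators are ordinary functions, so matching hand-written C depends on them being inlined and optimized.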
Re: H1 2015 Priorities and Bare-Metal Programming
On Sunday, 1 February 2015 at 06:37:27 UTC, Walter Bright wrote:
> I don't recall what you've suggested in this vein that was very unpopular - can you please post an example?

These are not in any particular order. They are just how I remember them. I don't care if everyone disagrees with them, but Rust has shown the benefits of a minimal, nimble runtime with a modular language implementation [7], and I don't see why D can't compete and offer something even better.

Issue 11666 - Separate each platform's port to its own folder/file [1]

Originally Iain Buclaw's idea, but the Bugzilla issue was filed by me. The problem is that it is difficult to see the abstractions in the runtime, because one has to navigate through hierarchies of `version`ing, making it difficult to port D to new platforms. The solution is to organize it by platform/architecture, separating each into its own folder. While there were a few pull requests attempting to address this, they all failed, mostly because participants could not agree on how to organize the Linux/Posix platform headers.

That made me wonder: what in the world are the platform bindings doing in druntime anyway, and why should they interfere with 11666? druntime is supposed to be the language implementation. If platform bindings are needed, fine, get them from Deimos, and encapsulate them. If users need platform bindings, they can get them from Deimos too, and the linker can sort things out.

I brought this up when the community was considering adding C++ standard library language bindings to druntime as well (yikes!). I tried to intervene, and took a beating [2]. I was criticised for whining instead of contributing with pull requests. Well, I did submit a pull request for the most unintrusive, uncontroversial, trivial thing I could think of to get the ball rolling [3]. But I was told I should discuss it on the forum. I decided to stop the ridiculous cycle of discussion and debate, and did not go back to the forum, as it was clear where that would have ended up.

If druntime was simply the language implementation, 11666 would have been resolved without debate, and we'd have a well-structured code base for the community to gradually and incrementally bring D to more platforms and architectures, without having to go their own way with their own runtime and their own tools.

I think the stuff in core.sys.* needs to be deported to Deimos, and privately imported by any ports that may need it. The current architecture of druntime is not very portable or scalable, especially to the bare-metal architectures.

Issue 13605 - Add ability to `version` a module declaration [4]

One of the issues brought up in 11666 was that using public imports introduces new namespaces. So, in my own experiments, I found that if I only had a way to `version` a module, I could potentially achieve better encapsulation of a given port. So, I filed this enhancement request. Although this enhancement changes nothing for anyone, and simply removes an arbitrary limitation, it was still controversial.

Issue 12270 - Move TypeInfo to the D Runtime [5]

If you remember from my and Adam Ruppe's DConf talks, we both mentioned the need to stub out TypeInfo in our runtime. It's a silly, but effective hack. Adam proposed an idea to move TypeInfo to the runtime, and I decided to log it in Bugzilla because I thought it was a great idea. But more importantly, it's a great example for future precedent. There seems to be something inherently wrong with the way the compiler is coupled to the runtime. I can't put my finger on it, but perhaps you know what I'm talking about. It seems the compiler needs to decouple from and delegate more to the runtime. Issue 12270 is a great start in that direction, but unfortunately I seem to be one of only two people actually interested in it, and I'm disappointed to say I don't know how to implement it. Furthermore, I suspect this may actually break a few things if it were implemented, and given the aversion to change in this community, my current belief is it would not be accepted even if it were implemented.

Moving runtime hook declarations to .di files in the runtime [6]

I want my runtime to inform the compiler which language features it supports, and have the compiler generate errors, at compile-time, if the user attempts to use a language feature that is not implemented.

Frustrated with the push-back from the larger community, I decided to take a lower profile and see if I could have more impact from the bottom up. I found GDC's code base a little easier to understand than DMD's, and the few bare-metal folks there are in this community all seem
Re: H1 2015 Priorities and Bare-Metal Programming
On Sunday, 1 February 2015 at 11:22:04 UTC, Johannes Pfau wrote:
> So I'd say there are not too many language problems, the main problem is runtime/compiler interaction:
>
> * If you don't want to use any runtime at all that's actually the easier part. We'd need to implement a little more of betterC but this can be done easily. Mike would prefer the compiler to autodetect the capabilities of the runtime (implemented hooks) instead of compiler switches. That'd be better but some more work.
> * Using only part of druntime is ugly. The one thing most people would probably like to strip out is the GC, but keep exception handling, threads, ... But the GC is everywhere: core.demangle doesn't work without, backtraces, exceptions, threads. Right now you either use all of druntime or nothing but it's not possible to use parts of druntime only, it's not modular enough.

Yes, I totally agree with this assessment.

Mike
Re: H1 2015 Priorities and Bare-Metal Programming
On Sunday, 1 February 2015 at 11:22:04 UTC, Johannes Pfau wrote:
> which perform as well as C code, but only with force-inline

Why is this still not part of the language? I'm not sure of anything else that has been repeatedly asked for without any good counterarguments.