Re: Performance issue in struct initialization
On Mon, 20 Jun 2016 20:34:12, Basile B. wrote:

> On Monday, 20 June 2016 at 11:45:28 UTC, Johannes Pfau wrote:
> > [...]
> > Can somebody explain how exactly constructors are related to
> > the problem?
>
> The initializer is copied to the chunk that represents the new
> aggregate instance. The idea here is to explicitly disable this
> copy to get a faster instantiation, under certain conditions. For
> example in allocator.make() this would mean "skip the call to
> emplace() and call __ctor() directly on the new chunk".
>
> > If I've got this:
> >
> >     struct Foo
> >     {
> >         int a = 42;
> >         int b = void;
> >
> >         @disable this();
> >         this(int b) { this.b = b; }
> >     }
> >     auto foo = Foo(41);
> >
> > I'd still expect a to be initialized to 42.
>
> That's exactly why with @noinit you would get a warning.
>
> > Note that this does _not_ require an initializer symbol or
> > memcpy.
>
> I've verified again and the initializer is copied. For example
> with a gap in the static initial values:

I meant this does not have to use a symbol. Right now it does, but that's an implementation issue.

>     struct Foo
>     {
>         int a = 7;
>         int gap = void;
>         int c = 8;
>         @disable this();
>         this(int a) { this.a = a; }
>     }
>     auto fun() { auto foo = Foo(41); return foo.a; }
>
> I get (-O -release -inline) for fun():
>
>     00457D58h  sub rsp, 18h
>     00457D5Ch  mov esi, 004C9390h            // typeid(Foo).initializer.ptr
>     00457D61h  lea rdi, qword ptr [rsp+08h]
>     00457D66h  movsq                         // copy 8, note that the gap is not handled at all
>     00457D68h  movsb                         // copy 1
>     00457D69h  movsb                         // copy 1
>     00457D6Ah  movsb                         // copy 1
>     00457D6Bh  movsb                         // copy 1
>     00457D6Ch  mov eax, 0029h                // inlined __ctor
>     00457D71h  mov dword ptr [rsp+08h], eax
>     00457D75h  add rsp, 18h
>     00457D79h  ret
>
> But that was obvious. How would you expect a = 7 and c = 8 to be
> generated otherwise?

Instead of doing foo = Foo.init (a symbol), the compiler could do foo = {a:7, c:8} (a literal). Then you don't need memory-to-memory moves at all if your architecture allows hardcoding constants in instructions. Additionally, this allows the optimizer to see when you're writing to a default-initialized variable and to remove the useless initialization.

If you don't have any default initializers, or all fields are =void, foo = {} is a no-op. With @noinit, using default initializers causes a warning. With my idea, default initializers work fine, and if you don't want any, you just set all fields to =void. (This does not produce optimal code right now, but considering the =void fields when initializing is a simple change in GDC.) See https://issues.dlang.org/show_bug.cgi?id=15951#c4 for details.

> with @noinit you would get
>
>     sub rsp, 18h
>     mov eax, 0029h                // inlined __ctor
>     mov dword ptr [rsp+08h], eax
>     add rsp, 18h
>     ret
>
> That's a really simple and pragmatic idea. But I guess that if
> you manage to get the compiler to generate a smarter initializer
> copy then the problem is fixed. At least I'll experiment with this
> noinit stuff in my user library.
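To illustrate the literal-style initialization described above, here is a hand-written sketch of what a compiler could emit for the Foo example: per-field stores of constants instead of a copy of the Foo.init symbol. The function name viaFieldStores is hypothetical, and the explicit `= void` only stands in for "no initializer copy"; this is not a claim about what DMD, GDC or LDC currently generate.

```d
struct Foo
{
    int a = 7;
    int gap = void; // no default value requested for this field
    int c = 8;

    @disable this();
    this(int a) { this.a = a; }
}

// Hypothetical manual equivalent of `auto foo = Foo(arg);` when the
// defaults are written as literals rather than copied from Foo.init.
int viaFieldStores(int arg)
{
    Foo foo = void; // stands in for "skip the initializer copy"
    foo.a = 7;      // default for a (a dead store the optimizer may drop)
    foo.c = 8;      // default for c
    // foo.gap is deliberately left untouched
    foo.a = arg;    // body of the constructor
    return foo.a + foo.c;
}

void main()
{
    auto foo = Foo(41);
    assert(foo.a == 41);
    assert(foo.c == 8); // the default is still honoured
    assert(viaFieldStores(41) == 49);
}
```

Written this way, the store of 7 to `a` is visibly dead, which is the optimization opportunity mentioned in the message.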
Re: Performance issue in struct initialization
On Monday, 20 June 2016 at 11:45:28 UTC, Johannes Pfau wrote:
> [...]
> Can somebody explain how exactly constructors are related to the
> problem?

The initializer is copied to the chunk that represents the new aggregate instance. The idea here is to explicitly disable this copy to get a faster instantiation, under certain conditions. For example in allocator.make() this would mean "skip the call to emplace() and call __ctor() directly on the new chunk".

> If I've got this:
>
>     struct Foo
>     {
>         int a = 42;
>         int b = void;
>
>         @disable this();
>         this(int b) { this.b = b; }
>     }
>     auto foo = Foo(41);
>
> I'd still expect a to be initialized to 42.

That's exactly why with @noinit you would get a warning.

> Note that this does _not_ require an initializer symbol or memcpy.

I've verified again and the initializer is copied. For example with a gap in the static initial values:

    struct Foo
    {
        int a = 7;
        int gap = void;
        int c = 8;
        @disable this();
        this(int a) { this.a = a; }
    }
    auto fun() { auto foo = Foo(41); return foo.a; }

I get (-O -release -inline) for fun():

    00457D58h  sub rsp, 18h
    00457D5Ch  mov esi, 004C9390h            // typeid(Foo).initializer.ptr
    00457D61h  lea rdi, qword ptr [rsp+08h]
    00457D66h  movsq                         // copy 8, note that the gap is not handled at all
    00457D68h  movsb                         // copy 1
    00457D69h  movsb                         // copy 1
    00457D6Ah  movsb                         // copy 1
    00457D6Bh  movsb                         // copy 1
    00457D6Ch  mov eax, 0029h                // inlined __ctor
    00457D71h  mov dword ptr [rsp+08h], eax
    00457D75h  add rsp, 18h
    00457D79h  ret

But that was obvious. How would you expect a = 7 and c = 8 to be generated otherwise?

With @noinit you would get:

    sub rsp, 18h
    mov eax, 0029h                // inlined __ctor
    mov dword ptr [rsp+08h], eax
    add rsp, 18h
    ret

That's a really simple and pragmatic idea. But I guess that if you manage to get the compiler to generate a smarter initializer copy then the problem is fixed. At least I'll experiment with this noinit stuff in my user library.
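The "skip emplace() and call __ctor() directly" idea can be tried at library level roughly as follows. This is a minimal sketch, assuming a malloc-backed allocation; makeNoInit is a hypothetical helper and not part of std.experimental.allocator. Because the initializer is never copied, any default field values would not be applied, so the constructor has to set every field it cares about.

```d
import core.stdc.stdlib : free, malloc;

struct Foo
{
    uint a;
    @disable this();
    this(uint a) { this.a = a; }
}

// Hypothetical helper: allocate raw memory and run the constructor on
// it without first copying T.init into the chunk.
T* makeNoInit(T, Args...)(Args args)
    if (is(T == struct) && __traits(hasMember, T, "__ctor"))
{
    auto p = cast(T*) malloc(T.sizeof);
    assert(p !is null, "allocation failed");
    p.__ctor(args); // explicit constructor call on uninitialized memory
    return p;
}

void main()
{
    auto foo = makeNoInit!Foo(41u);
    scope (exit) free(foo);
    assert(foo.a == 41);
}
```

The template constraint restricts the helper to structs that declare a constructor, which mirrors condition 2 of the proposal.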
Re: Performance issue in struct initialization
On Sun, 19 Jun 2016 20:52:52, deadalnix wrote:

> On Sunday, 19 June 2016 at 11:11:18 UTC, Basile B. wrote:
> > A new "@noinit" attribute could solve this issue and other
> > cases where the initializer is a handicap:
> > [...]
>
> No new attribute please. Just enable the damn thing where there
> is an argumentless constructor and be done with it.

Can somebody explain how exactly constructors are related to the problem?

If I've got this:

    struct Foo
    {
        int a = 42;
        int b = void;

        @disable this();
        this(int b) { this.b = b; }
    }
    auto foo = Foo(41);

I'd still expect a to be initialized to 42. Note that this does _not_ require an initializer symbol or memcpy. I'll post a more detailed follow-up to issue 15951.
Re: Performance issue in struct initialization
On Sat, 28 May 2016 07:08:52, Era Scarecrow wrote:

> On Friday, 27 May 2016 at 09:02:17 UTC, Johan Engelen wrote:
> > That language guarantee prevents optimization of the
> > initialization (in this case, the optimized result would be no
> > initialization at all). So a breaking language spec change
> > would be needed. Is this pursued by anyone? Perhaps only relax
> > the spec when the struct S overrides opEquals?
> >
> > (Once the optimization is allowed, I think it will be a fun
> > project for me to implement it in LDC. But please keep the
> > discussion clean by not discussing how a compiler should make
> > use of this language change, how to implement it, etc. Thanks!)
>
> If opEquals and opCmp are overridden then I don't see why voids
> in initialization can't work since how you are comparing it would
> determine equality and not a bitwise compare...

I don't think this is a good solution. A '=void' field will always break 'a is b', even with a user-defined opEquals. And of course a custom opEquals could do bit comparisons as well.

I think a better way to frame this for the spec is:

* A struct containing a =void field (directly or nested) cannot be
  compared bitwise. The compiler is allowed to fill the =void parts
  with arbitrary data. a == b will only compile if a user-provided
  opEquals is available. a is b will never compile for such structs.
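The distinction between `==` with a user-provided opEquals and the bitwise `is` can be seen in a small example. This is only a sketch under current language rules, where `a is b` still compiles; the struct S, its field names and the choice to compare only `id` are assumptions made for illustration.

```d
struct S
{
    int id;
    char[100] scratch = void; // contents are not part of the value

    // User-provided equality that ignores the =void field.
    bool opEquals(const S rhs) const
    {
        return id == rhs.id;
    }
}

void main()
{
    S a, b;
    a.id = b.id = 1;

    // Make the scratch areas differ on purpose, as arbitrary
    // compiler-chosen fill data could.
    a.scratch[0] = 'x';
    b.scratch[0] = 'y';

    assert(a == b);  // uses opEquals: only id is compared
    assert(a !is b); // bitwise identity sees the differing bytes
}
```

Under the rule proposed in the message, the `is` comparison in the last line would be rejected at compile time instead of yielding a result that depends on the fill data.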
Re: Performance issue in struct initialization
On Sunday, 19 June 2016 at 20:52:52 UTC, deadalnix wrote:
> On Sunday, 19 June 2016 at 11:11:18 UTC, Basile B. wrote:
> > [...]
>
> No new attribute please. Just enable the damn thing where there
> is an argumentless constructor and be done with it.

One , if not the one I like the best: https://www.youtube.com/watch?v=xNRBajLM8_4
Re: Performance issue in struct initialization
On Sunday, 19 June 2016 at 20:52:52 UTC, deadalnix wrote:
> On Sunday, 19 June 2016 at 11:11:18 UTC, Basile B. wrote:
> > [...]
>
> No new attribute please. Just enable the damn thing where there
> is an argumentless constructor and be done with it.

we need hobo structs...
Re: Performance issue in struct initialization
On Sunday, 19 June 2016 at 11:11:18 UTC, Basile B. wrote:
> A new "@noinit" attribute could solve this issue and other cases
> where the initializer is a handicap:
> [...]
> The rationale is that when there's a constructor that takes
> parameters it's really supposed to initialize the aggregate. At
> least that would be the contract, the "postulate", put by the
> usage of @noinit.

No new attribute please. Just enable the damn thing where there is an argumentless constructor and be done with it.
Re: Performance issue in struct initialization
On Sunday, 19 June 2016 at 11:11:18 UTC, Basile B. wrote:
> A new "@noinit" attribute could solve this issue and other cases
> where the initializer is a handicap:
> [...]
> The rationale is that when there's a constructor that takes
> parameters it's really supposed to initialize the aggregate. At
> least that would be the contract, the "postulate", put by the
> usage of @noinit.

If RCStrings are always allocated with allocators then we can implement this at the library level... otherwise it's a DIP ;)
Re: Performance issue in struct initialization
On Saturday, 23 April 2016 at 13:37:31 UTC, Andrei Alexandrescu wrote:
> https://issues.dlang.org/show_bug.cgi?id=15951. I showed a few
> obvious cases, but finding the best code in general is tricky.
> Ideas? -- Andrei

A new "@noinit" attribute could solve this issue and other cases where the initializer is a handicap.

The runtime would skip the copy of the initializer when:

1- @noinit is an attribute of an aggregate.
2- a ctor that takes at least one parameter is present.
3- the default ctor is disabled (only a condition for the structs or the new free form unions)

    // OK
    @noinit struct Foo
    {
        uint a;
        @disable this();
        this(uint a){}
    }

    // not accepted because a ctor with parameters is missing
    @noinit struct Foo
    {
        @disable this();
    }

    // OK but a warning will be emitted...
    @noinit struct Foo
    {
        uint a = 1; // ...because this value is ignored
        @disable this();
        this(uint a){}
    }

    // not accepted because there's a default ctor
    @noinit struct Foo
    {
        this(){}
    }

The rationale is that when there's a constructor that takes parameters it's really supposed to initialize the aggregate. At least that would be the contract, the "postulate", put by the usage of @noinit.
Re: Performance issue in struct initialization
On Friday, 27 May 2016 at 09:02:17 UTC, Johan Engelen wrote:
> That language guarantee prevents optimization of the
> initialization (in this case, the optimized result would be no
> initialization at all). So a breaking language spec change would
> be needed. Is this pursued by anyone? Perhaps only relax the spec
> when the struct S overrides opEquals?
>
> (Once the optimization is allowed, I think it will be a fun
> project for me to implement it in LDC. But please keep the
> discussion clean by not discussing how a compiler should make use
> of this language change, how to implement it, etc. Thanks!)

If opEquals and opCmp are overridden then I don't see why voids in initialization can't work since how you are comparing it would determine equality and not a bitwise compare...

Hmmm had a longer reply involving changes to a new struct type with a few changes...
Re: Performance issue in struct initialization
https://issues.dlang.org/show_bug.cgi?id=15951
https://issues.dlang.org/show_bug.cgi?id=11817
https://issues.dlang.org/show_bug.cgi?id=11331

What I gather from the discussions is that the current spec guarantees that the following assertion holds:

```
struct S { char[100] arr = void; }
S a;
S b;
assert(a == b);
```

That language guarantee prevents optimization of the initialization (in this case, the optimized result would be no initialization at all). So a breaking language spec change would be needed. Is this pursued by anyone? Perhaps only relax the spec when the struct S overrides opEquals?

(Once the optimization is allowed, I think it will be a fun project for me to implement it in LDC. But please keep the discussion clean by not discussing how a compiler should make use of this language change, how to implement it, etc. Thanks!)

cheers,
Johan
Re: Performance issue in struct initialization
On Saturday, 23 April 2016 at 14:10:06 UTC, Basile B wrote:
> [...]
> The DMD code is shorter than the GDC one because it copies the
> initializer with `rep movsq` while GDC unrolls the `rep` with a
> bunch of `movq`, but in both cases the init is always fully
> copied. So the problem is the suboptimal copy of the initializer.

I hadn't seen it before, but this problem is actually well known:

https://issues.dlang.org/show_bug.cgi?id=11817
https://issues.dlang.org/show_bug.cgi?id=11331

It really looks like this can only be solved by a compiler update. Even with a custom this() with parameters and the default this() disabled, the initializer is **always** copied.
Re: Performance issue in struct initialization
On 4/23/16 10:10 AM, Basile B wrote:
> On Saturday, 23 April 2016 at 13:37:31 UTC, Andrei Alexandrescu wrote:
> > https://issues.dlang.org/show_bug.cgi?id=15951. I showed a few
> > obvious cases, but finding the best code in general is tricky.
> > Ideas? -- Andrei
>
> In the first example you forgot to assign void: [...] to get the
> real equivalent to the C++ version.

Thanks. I'm not looking for 100% equivalence as much as defining a proper baseline in the C++ version. Asking the user to use "= void" is not acceptable.

Andrei
Re: Performance issue in struct initialization
On Saturday, 23 April 2016 at 13:37:31 UTC, Andrei Alexandrescu wrote:
> https://issues.dlang.org/show_bug.cgi?id=15951. I showed a few
> obvious cases, but finding the best code in general is tricky.
> Ideas? -- Andrei

In the first example you forgot to assign void:

http://d.godbolt.org/#compilers:!((compiler:gdc46,options:'-O3',source:'struct+RCStr(C)%0A%7B%0Aenum+uint+maxSmall+%3D+64+/+C.sizeof+-+1%3B%0AC%5BmaxSmall%5D+small+%3D+void%3B%0Aubyte+smallLen+%3D+void%3B%0A%7D%0A%0Aauto+fun()+%7B%0A++RCStr!!char+result+%3D+void%3B%0A++return+result%3B%0A%7D%0A')),filterAsm:(commentOnly:!t,directives:!t,labels:!t),version:3

to get the real equivalent to the C++ version.

The problem in the second example is that the full initializer is always copied from the static layout typeid(Stuff).initializer[0..$], despite the fact that some of the members are not initialized, while in C++ only the members that have an initializer are copied.

The DMD code is shorter than the GDC one because it copies the initializer with `rep movsq` while GDC unrolls the `rep` with a bunch of `movq`, but in both cases the init is always fully copied. So the problem is the suboptimal copy of the initializer.
Re: Performance issue in struct initialization
On 24/04/2016 1:37 AM, Andrei Alexandrescu wrote:
> https://issues.dlang.org/show_bug.cgi?id=15951. I showed a few
> obvious cases, but finding the best code in general is tricky.
> Ideas? -- Andrei

That opEquals that is generated looks awfully big as well.

Also, from what I've read you could get a speed boost if you lump the movs together. E.g.

    movq , rax
    movq , rbx
    movq , rcx
    movq , rdx
    movq , r8
    movq , r9
    ...
    movq rax,
    movq rbx,
    movq rcx,
    movq rdx,
    movq r8,
    movq r9,

In theory, on newer CPUs that should be fairly cheap compared to alternating between the two. Of course that won't help at all in decreasing the instruction count.

Unfortunately I can't find anything backing this up, so take it with a grain of salt; rep mov might be just as good. Also, there is a new series of CPUs coming out in a month or two, so who knows if it'll change, assuming it's valid now.
Performance issue in struct initialization
https://issues.dlang.org/show_bug.cgi?id=15951. I showed a few obvious cases, but finding the best code in general is tricky. Ideas? -- Andrei