Re: approxEqual() has fooled me for a long time...
On Thu, 21 Oct 2010 00:19:11 -0400, Andrei Alexandrescu wrote: On 10/20/10 16:33 CDT, Don wrote: Walter Bright wrote: Andrei Alexandrescu wrote: On 10/20/10 13:42 CDT, Walter Bright wrote: Don wrote: I'm personally pretty upset about the existence of that function at all. My very first contribution to D was a function for floating point approximate equality, which I called approxEqual. It gives equality in terms of number of bits. It gives correct results in all the tricky special cases. Unlike a naive relative equality test involving divisions, it doesn't fail for values near zero. (I _think_ that's the reason why people think you need an absolute equality test as well). And it's fast. No divisions, no poorly predictable branches. I totally agree that a precision based on the number of bits, not the magnitude, is the right approach. I wonder, could that be also generalized for zero? I.e., if a number is zero except for k bits in the mantissa. Zero is a special case I'm not sure how to deal with. It does generalize to zero. Denormals have the first k bits in the mantissa set to zero. feqrel automatically treats them as 'close to zero'. It just falls out of the maths. BTW if the processor has a "flush denormals to zero" mode, denormals will compare exactly equal to zero. So here's a plan of attack: 1. Keep feqrel. Clearly it's a useful primitive. vote++ 2. Find a more intuitive interface for feqrel, i.e. using decimal digits for precision etc. The definition in std.math: "the number of mantissa bits which are equal in x and y" loses 90% of the readership at the word "mantissa". You want something intuitive that people can immediately picture. 3. Define a good name for that When I think of floating point precision, I automatically think in ULP (units in last place). It is how the IEEE 754 specification specifies the precision of the basic math operators, in part because a given ULP requirement is applicable to any floating point type. 
And the ability for ulp(x,y) <= 2 to be meaningful for floats, doubles and reals is great for templates/generic programming. Essentially ulp(x,y) == min(x.mant_dig, y.mant_dig) - feqrel(x,y); On this subject, I remember that immediately after learning about the "==" operator I was instructed to never, ever use it for floating point values unless I knew for a fact one value had to be a copy of another. This of course leads to bad programming habits like: "In floating-point arithmetic, numbers, including many simple fractions, cannot be represented exactly, and it may be necessary to test for equality within a given tolerance. For example, rounding errors may mean that the comparison in a = 1/7 if a*7 = 1 then ... unexpectedly evaluates false. Typically this problem is handled by rewriting the comparison as if abs(a*7 - 1) < tolerance then ..., where tolerance is a suitably tiny number and abs is the absolute value function." - from Wikipedia's Comparison (computer programming) page. Since D has the "is" operator, does it make sense to actually 'fix' "==" to be fuzzy? Or perhaps to make the set of extended floating point comparison operators fuzzy?(i.e. "!<>" <=> ulp(x,y) <= 2) Or just add an approximately equal operator (i.e. "~~" or "=~") (since nobody will type the actual Unicode ≈) Perhaps even a tiered approach ("==" <=> ulp(x,y) <= 1, "~~" <=> ulp(x,y) <= 8).
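For readers who want to try the feqrel-based approach, here is a minimal sketch in D. `closeEnough` is a hypothetical helper name (not a Phobos function); only `feqrel` itself comes from std.math.

```d
import std.math : feqrel;

// Hypothetical helper built on feqrel, roughly the ulp(x,y) idea above:
// two values are "close" when all but a few trailing mantissa bits agree.
bool closeEnough(T)(T x, T y, int ulps = 2)
{
    // mant_dig - feqrel(x, y) is the number of mantissa bits that differ
    return T.mant_dig - feqrel(x, y) <= ulps;
}

void main()
{
    double a = 0.1 + 0.2;        // not exactly 0.3 in binary floating point
    assert(a != 0.3);            // exact == fails after rounding
    assert(closeEnough(a, 0.3)); // but the values agree to within 2 ulps
}
```

Note that this works uniformly for float, double, and real, since `T.mant_dig` adapts to the type.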
Re: struct field alignment
On Sun, 17 Oct 2010 23:38:34 -0400, Walter Bright wrote: Robert Jacques wrote: Although I have a solution that works well for me, the one thing I lament is the lack of a canonical D way of expressing align(8)/align(16), even at only a meta-information level: if Phobos gets a small vector library, I can't use it, and conversely I'm not motivated to improve/submit my own small vector library to Phobos. I'm painfully aware that align(8)/(16) don't work on the 32-bit targets. I've been reluctant to fix that because it involves some performance degradation (keeping the stack so aligned requires the insertion of stack adjustment instructions here and there). With the 64-bit target, however, the C ABI will force the issue. It'll support those alignments. Cool. I understand the performance issue, but would it be possible for the internal alignment of a struct to be correct? i.e. struct A { float x; float2 point; } would be properly aligned internally: struct A { float x; int padding; float2 point; }. That way, 32-bit programs could read binary files from 64-bit programs and vice versa.
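A sketch of the manual-padding idea, using `Float2` as a stand-in for the hypothetical float2 vector type:

```d
struct Float2 { float x, y; }  // stand-in for a small vector type

struct A
{
    float x;
    float pad;      // explicit padding so `point` lands at offset 8
    Float2 point;   // on both 32- and 64-bit targets
}

// With the padding in place, the binary layout is target independent:
static assert(A.point.offsetof == 8);
static assert(A.sizeof == 16);
```

The point is that files written by a 64-bit build (where 8-byte alignment is automatic) remain readable by a 32-bit build, and vice versa.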
Re: Possible bug in atomicOp
On Sat, 23 Oct 2010 15:50:30 -0400, Sean Kelly wrote: Benjamin Thaut wrote: On 23.10.2010 14:52, dsimcha wrote: == Quote from Benjamin Thaut (c...@benjamin-thaut.de)'s article The following testcase (when executed on a dual core, at least) results in an endless loop inside atomicOp. import std.stdio; import std.concurrency; enum Messages { GO, END } shared class Account { private double amount = 0; double getAmount() const { return amount; } void change(double change){ atomicOp!"+="(amount,change); } } shared Account bank = null; void otherThread(Tid father){ send(father,Messages.GO); for(int i=0;i<1000;i++) bank.change(-100); send(father,Messages.END); } void main(string[] args) { bank = new Account(); spawn(&otherThread,thisTid); receiveOnly!(Messages)(); for(int i=0;i<1000;i++) bank.change(+100); receiveOnly!(Messages)(); writefln("Program finished. Amount is %s",bank.getAmount()); } Is this a bug, or am I doing something wrong here? If it is a bug, it is kind of critical, because people who are reading "The D Programming Language" book won't be happy to find out that some of the given examples do not work yet. http://d.puremagic.com/issues/show_bug.cgi?id=4782 Basically, atomicLoad (which atomicOp uses) always returns in ALU registers. Floating point numbers need to be returned in floating point registers. Therefore, a NaN always gets returned from atomicLoad!double, and a NaN isn't equal to anything. So shouldn't there be a static assert to prevent one from using atomicOp with floats and doubles? Or should atomicLoad be implemented to support floats and doubles? The former in the short term and the latter in the long term. Well, here's the assembler to load a value onto the FP stack: float __int2float (ref int x) { asm { fld float ptr [EAX]; } } double __long2double(ref long x) { asm { fld double ptr [EAX]; } } Unfortunately, direct loading from a register doesn't seem to be supported. 
So writing wrapper code which uses a union would be almost as fast. I know there is an SSE instruction to load directly from registers, but we can't assume SSE support.
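The union-based wrapper mentioned above might look like this. This is only a sketch: `atomicLoadDouble` is a hypothetical name, and it assumes an `atomicLoad` that works on shared long (as in core.atomic).

```d
import core.atomic : atomicLoad;

// Atomically load the bit pattern through the ALU as a long, then
// reinterpret it as a double via a union. This sidesteps the NaN
// problem without requiring SSE.
double atomicLoadDouble(ref shared double x)
{
    union Bits { long l; double d; }
    Bits b;
    b.l = atomicLoad(*cast(shared long*)&x); // returned in an ALU register
    return b.d;                              // reinterpreted through memory
}
```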
Re: Possible bug in atomicOp
On Mon, 25 Oct 2010 00:08:03 -0400, Don wrote: Robert Jacques wrote: On Sat, 23 Oct 2010 15:50:30 -0400, Sean Kelly wrote: Basically, atomicLoad (which atomicOp uses) always returns in ALU registers. Floating point numbers need to be returned in floating point registers. Therefore, a NaN always gets returned from atomicLoad!double, and a NaN isn't equal to anything. So shouldn't there be a static assert to prevent one from using atomicOp with floats and doubles? Or should atomicLoad be implemented to support floats and doubles? The former in the short term and the latter in the long term. Well, here's the assembler to load a value onto the FP stack: float __int2float (ref int x) { asm { fld float ptr [EAX]; } } double __long2double(ref long x) { asm { fld double ptr [EAX]; } } That should be: fild dword ptr [EAX]; fild qword ptr [EAX]; Oops. I should have named them __int_as_float and __long_as_double. The point was to find an efficient way to return an int or a double in a register on the x87 stack.
Re: Proposal: Relax rules for 'pure'
On Mon, 25 Oct 2010 09:44:14 -0400, Bruno Medeiros wrote: On 23/09/2010 23:39, Robert Jacques wrote: On Thu, 23 Sep 2010 16:35:23 -0400, Tomek Sowiński wrote: On topic: this means a pure function can take a reference to data that can be mutated by someone else. So we're giving up on the "can parallelize with no data races" guarantee on pure functions? In short, no. In long: the proposal is for pure functions to be broken up into two groups (weak and strong) based on their function signatures. This division is internal to the compiler, and isn't expressed in the language in any way. Strongly-pure functions provide all the guarantees that pure does today and can be automatically parallelized or cached without consequence. Weakly-pure functions don't provide either of these guarantees, but allow a much larger number of functions to be strongly-pure. In order to guarantee a function is strongly pure, one would have to declare all its inputs immutable or use an appropriate template constraint. I think we need to be more realistic about what kinds of optimizations we could expect from a D compiler and pure functions. Caching might be done, but only in a temporary sense (caching within a limited execution scope). I doubt we would ever have something like memoization, which would incur memory costs (potentially quite big ones), so the compiler almost certainly would not be able to know (without additional metadata/annotations or compiler options) whether that trade-off is acceptable. Similarly for parallelism: how would the compiler know that it's OK to spawn 10 or 100 new threads to parallelize the execution of some loop? The consequences for program and whole-machine scheduling would not be trivial and easy to understand. 
I suspect all these considerations might be very difficult to guarantee on a non-VM environment. Ahem, it's trivial for the compiler to know if it's okay to spawn 10 or 100 _tasks_. Tasks, as opposed to threads or even thread pools, are extremely cheap (think on the order of function call overhead).
Re: Proposal: Relax rules for 'pure'
On Thu, 28 Oct 2010 10:48:34 -0400, Bruno Medeiros wrote: On 26/10/2010 04:47, Robert Jacques wrote: On Mon, 25 Oct 2010 09:44:14 -0400, Bruno Medeiros wrote: On 23/09/2010 23:39, Robert Jacques wrote: On Thu, 23 Sep 2010 16:35:23 -0400, Tomek Sowiński wrote: On topic: this means a pure function can take a reference to data that can be mutated by someone else. So we're giving up on the "can parallelize with no data races" guarantee on pure functions? In short, no. In long: the proposal is for pure functions to be broken up into two groups (weak and strong) based on their function signatures. This division is internal to the compiler, and isn't expressed in the language in any way. Strongly-pure functions provide all the guarantees that pure does today and can be automatically parallelized or cached without consequence. Weakly-pure functions don't provide either of these guarantees, but allow a much larger number of functions to be strongly-pure. In order to guarantee a function is strongly pure, one would have to declare all its inputs immutable or use an appropriate template constraint. I think we need to be more realistic about what kinds of optimizations we could expect from a D compiler and pure functions. Caching might be done, but only in a temporary sense (caching within a limited execution scope). I doubt we would ever have something like memoization, which would incur memory costs (potentially quite big ones), so the compiler almost certainly would not be able to know (without additional metadata/annotations or compiler options) whether that trade-off is acceptable. Similarly for parallelism: how would the compiler know that it's OK to spawn 10 or 100 new threads to parallelize the execution of some loop? The consequences for program and whole-machine scheduling would not be trivial and easy to understand. 
For this to happen, amongst other things the compiler and OS would need to ensure that the spawned threads would not starve the rest of the threads of that program. I suspect all these considerations might be very difficult to guarantee in a non-VM environment. Ahem, it's trivial for the compiler to know if it's okay to spawn 10 or 100 _tasks_. Tasks, as opposed to threads or even thread pools, are extremely cheap (think on the order of function call overhead). What are these tasks you mention? I've never heard of them. The programming language Cilk popularized the concept of parallelization through many small tasks combined with a work-stealing runtime. Futures are essentially the same concept, but because futures were generally implemented with OS threads, a thread pool or fibers/coroutines, that term is generally avoided. Like message passing, tasks are often implemented in libraries, with Intel's Threading Building Blocks probably being the most famous, though both Microsoft's Task Parallel Library and Apple's Grand Central are gaining mind-share. David Simcha currently has a task library in review for inclusion in Phobos. Basically, the point of tasks is to provide parallelization with extremely low overhead (on average a Cilk spawn is less than 4 function calls). That way, instead of having a few coarse-grain threads which neither scale nor load balance well, you're encouraged to use tasks everywhere and therefore reap the benefits of a balanced N-way scalable system. Getting back to pure, one of the "big" advantages of functional languages is their ability to automatically parallelize themselves; and they use a work-stealing runtime (aka tasks) to do this.
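For illustration, here is roughly what a task spawn looks like with David Simcha's library (the names follow the std.parallelism API it became; details may differ from the version under review at the time):

```d
import std.parallelism;
import std.stdio;

int expensive(int x) { return x * x; }

void main()
{
    // A task is just a deferred call: creating one costs on the order of
    // a function call, not an OS thread.
    auto t = task!expensive(21);
    taskPool.put(t);       // queue it on the work-stealing pool
    // ... do other work in parallel here ...
    writeln(t.yieldForce); // wait for (or help compute) the result: 441
}
```

The pool's worker threads steal queued tasks, so thousands of fine-grained tasks load balance across however many cores are available.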
Re: Proposal: Relax rules for 'pure'
On Fri, 29 Oct 2010 07:13:46 -0400, tls wrote: Robert Jacques Wrote: Getting back to pure, one of the "big" advantages of functional languages is their ability to automatically parallelize themselves; and they use a work-stealing runtime (aka tasks) to do this. What functional has task? I switch in Meantime when D2 not staple to study this. Every functional language can do this; it's part and parcel of being functional. It's really more a question of how mature their runtimes are than of language support.
Re: Proposal: Relax rules for 'pure'
On Mon, 01 Nov 2010 10:24:43 -0400, Bruno Medeiros wrote: On 29/10/2010 02:32, Robert Jacques wrote: [snip] The programming language Cilk popularized the concept of parallelization through many small tasks combined with a work-stealing runtime. Futures are essentially the same concept, but because futures were generally implemented with OS threads, a thread pool or fibers/coroutines, that term is generally avoided. Like message passing, tasks are often implemented in libraries, with Intel's Threading Building Blocks probably being the most famous, though both Microsoft's Task Parallel Library and Apple's Grand Central are gaining mind-share. David Simcha currently has a task library in review for inclusion in Phobos. Basically, the point of tasks is to provide parallelization with extremely low overhead (on average a Cilk spawn is less than 4 function calls). That way, instead of having a few coarse-grain threads which neither scale nor load balance well, you're encouraged to use tasks everywhere and therefore reap the benefits of a balanced N-way scalable system. Hum, I see what you mean now, but tasks only help with the *creation overhead* of otherwise spawning lots of OS threads; they don't solve the main problems I mentioned. First, it may be fine to spawn 100 tasks, but there is still the issue of deciding how many OS threads the tasks will run in! Obviously, you won't run them in just one OS thread, otherwise you won't get any parallelism. Ideally, considering your program only, it would have as many OS threads as there are cores. But there is still the issue of whether it's OK for your program to use up all the cores in your machine. The compiler doesn't know that. Could it be enough to have a global compiler option to specify that? I don't think so: what if you want some code of your program to use as many OS threads as possible, but not some other code? 
Second, and perhaps more importantly, the very same issue occurs in the scope of your program alone. So, even if you use all OS threads, and don't care about other programs, spawning 100 tasks for some loop might take time away from other more important tasks of your program. The compiler/task-scheduler/whatever would not automatically know what is acceptable and what is not. (the only exception being if your program was logically single-threaded) Controlling the task runtime thread-pool size is trivial. Indeed, you'll often want to reduce the number of daemon threads by the number of active program threads. And if you need fine grain control over pool sizes, you can always create separate pools and assign tasks to them. I think a reasonable default would be (# of cores - 2) daemons with automatic decreases/increases with every spawn/termination. But, all those settings should be controllable at runtime.
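The kind of runtime control described above looks like this in the task library's API (names are from the current std.parallelism and are assumptions about the version then under review):

```d
import std.parallelism;

void main()
{
    // Size the default pool: (# of cores - 2) daemon threads, at least 1,
    // adjustable at any time during execution.
    defaultPoolThreads = totalCPUs > 2 ? totalCPUs - 2 : 1;

    // Fine-grained control: give lower-priority work its own small pool
    // so it can't starve the default one.
    auto backgroundPool = new TaskPool(2);
    scope(exit) backgroundPool.finish(); // wind down its daemon threads
}
```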
Re: Overzealous recursive template expansion protection?
On Tue, 02 Nov 2010 22:03:47 -0400, Gareth Charnock wrote: I've been trying to correctly implement the interpreter pattern/expression templates in D (for reference, a summary of the C++ interpreter pattern can be found here: http://www.drdobbs.com/184401627). I've run into a problem and I'm not sure if it's a compiler bug or not. The testcase is: struct BinaryOp(L,string op,R) { pragma(msg,"Instantiating " ~ typeof(this).stringof); BinaryOp!(typeof(this),s,R1) opBinary(string s,R1)(R1 r) { pragma(msg,"Instantiating BinaryOp.opBinary " ~ L.stringof ~ op ~ R1.stringof); return typeof(return)(); } } struct Leaf { BinaryOp!(typeof(this),s,R) opBinary(string s,R)(R r) { pragma(msg,"Instantiating leaf.opBinary(" ~ R.stringof ~ ")"); return typeof(return)(); } } void main() { Leaf v1,v2,v3; pragma(msg,""); pragma(msg,"=== This Compiles =="); v1*(v2*v3); pragma(msg,""); pragma(msg,"=== This Doesn't =="); (v1*v2)*v3; } Output: === This Compiles == Instantiating BinaryOp!(Leaf,s,Leaf) Instantiating leaf.opBinary(Leaf) Instantiating BinaryOp!(Leaf,s,BinaryOp!(Leaf,s,Leaf)) Instantiating leaf.opBinary(BinaryOp!(Leaf,s,Leaf)) === This Doesn't == Error: recursive template expansion for template argument BinaryOp!(Leaf,s,Leaf) I've tracked the problem down to the return type of BinaryOp.opBinary. Clearly, putting BinaryOp!(typeof(this),...) in the main template body would be a Bad Thing, but opBinary is a template that may or may not be instantiated, so it shouldn't automatically lead to runaway instantiation. It seems the compiler is a little overzealous in making sure that such runaway instantiations do not happen. Is this a bug? Should I file it? 
Here's what I think a minimal test case might look like: struct A(T1) { void templateFunc(T2)(T2 a) { alias A!(typeof(this)) error; } } void main() { A!int a; a.templateFunc!int(0); } I'm going to lean on the side of this being a compiler bug (so please file), as there are multiple workarounds that don't logically change anything. Here's one: struct BinaryOp(alias L,string op, R) { pragma(msg,"Instantiating ", typeof(this).stringof); BinaryOp!(BinaryOp,s~"",R1) opBinary(string s, R1)(R1 r) { pragma(msg,"Instantiating BinaryOp.opBinary", L.stringof, " ", op," ",R1.stringof); return typeof(return)(); } } struct Leaf { BinaryOp!(Leaf,s~"",R) opBinary(string s,R)(R r) { pragma(msg,"Instantiating leaf.opBinary(", R.stringof, ")"); return typeof(return)(); } } And here's another: struct BinaryOp(L,string op, R) { pragma(msg,"Instantiating ", typeof(this).stringof); BinaryOp!(BinaryOp,s,R1) opBinary(string s, R1)(R1 r) { pragma(msg,"Instantiating BinaryOp.opBinary", L.stringof, " ", op," ",R1.stringof); return typeof(return)(); } } struct Leaf { BinaryOp!(Leaf,s~"",R) opBinary(string s,R)(R r) { pragma(msg,"Instantiating leaf.opBinary(", R.stringof, ")"); return typeof(return)(); } } In general, when passing template value parameters to another template, I'd recommend performing a no-op on them (i.e. ~"" or +0), since sometimes they're passed as N or op instead of 10 or "+". Also, you can use the template name inside it to refer to that instance's type (i.e. you don't have to use typeof(this)).
Re: Why is 'scope' so weak?
On Tue, 23 Nov 2010 07:59:27 -0500, Lars T. Kyllingstad wrote: If I've understood things correctly, by marking a delegate parameter with 'scope' you tell the compiler not to create a true closure for the delegate. Effectively you're saying "I promise not to escape this delegate, so you don't need to copy its context to the heap". In brief, my question is: Why doesn't the compiler enforce this promise? In particular, why is 'scope' not a type constructor? For scope to be a type constructor, D requires some form of ownership-types & local escape analysis. Just like mutable and immutable data needs const, I think stack/thread-local/shared data needs scope. (There is an old proposal on the wiki about the technical implementation, though it's badly worded) But my understanding is that all things ownership related are relegated to D3.
Re: String compare performance
On Sat, 27 Nov 2010 22:08:29 -0500, bearophile wrote: I have done another test: Timings, dmd compiler, best of 4, seconds: D #1: 5.72 D #4: 1.84 D #5: 1.73 Psy: 1.59 D #2: 0.55 D #6: 0.47 D #3: 0.34 import std.file: read; import std.c.stdio: printf; int test(char[] data) { int count; foreach (i; 0 .. data.length - 3) { char[] codon = data[i .. i + 3]; if ((codon.length == 3 && codon[0] == 'T' && codon[1] == 'A' && codon[2] == 'G') || (codon.length == 3 && codon[0] == 'T' && codon[1] == 'G' && codon[2] == 'A') || (codon.length == 3 && codon[0] == 'T' && codon[1] == 'A' && codon[2] == 'A')) count++; } return count; } void main() { char[] data0 = cast(char[])read("data.txt"); int n = 300; char[] data = new char[data0.length * n]; for (size_t pos; pos < data.length; pos += data0.length) data[pos .. pos+data0.length] = data0; printf("%d\n", test(data)); } So when there is to compare among strings known at compile-time to be small (like < 6 char), the comparison shall be replaced with inlined single char comparisons. This makes the code longer so it increases code cache pressure, but seeing how much slow the alternative is, I think it's an improvement. (A smart compiler is even able to remove the codon.length==3 test because the slice data[i..i+3] is always of length 3 (here mysteriously if you remove those three length tests the program compiled with dmd gets slower)). Bye, bearophile Hi bearophile, I've spent some time having fun this afternoon optimizing array-equals using vectorization techniques. I found that vectorizing using ulongs worked best on my system except with the shortest strings, where a simple Duff's device edged it out. 
If you'd like to try it out on your data set: bool arrayComp3(bool useBitCompare = true,T)(const T[] a, const T[] b) pure nothrow { if(a.length != b.length) return false; static if(useBitCompare) { auto pab = cast(ubyte*)a.ptr; auto pbb = cast(ubyte*)b.ptr; if(pab is pbb) return true; auto byte_length = a.length*T.sizeof; auto pa_end = cast(ulong*)(pab+byte_length); final switch (byte_length % ulong.sizeof) { case 7: if(*pab++ != *pbb++ ) return false; case 6: if(*pab++ != *pbb++ ) return false; case 5: if(*pab++ != *pbb++ ) return false; case 4: if(*pab++ != *pbb++ ) return false; case 3: if(*pab++ != *pbb++ ) return false; case 2: if(*pab++ != *pbb++ ) return false; case 1: if(*pab++ != *pbb++ ) return false; case 0: } auto pa = cast(ulong*)pab; auto pb = cast(ulong*)pbb; while (pa < pa_end) { if(*pa++ != *pb++ ) return false; } } else {// default to a short duff's device auto pa = a.ptr; auto pb = b.ptr; if(pa is pb) return true; auto n = (a.length + 3) / 4; final switch (a.length % 4) { case 0:do { if(*pa++ != *pb++ ) return false; case 3: if(*pa++ != *pb++ ) return false; case 2: if(*pa++ != *pb++ ) return false; case 1: if(*pa++ != *pb++ ) return false; } while (--n > 0); }} return true; }
Re: String compare performance
On Sun, 28 Nov 2010 06:44:23 -0500, bearophile wrote: Robert Jacques: I've spent some time having fun this afternoon optimizing array-equals using vectorization techniques. I found that vectorizing using ulongs worked best on my system except with the shortest strings, where a simple Duff's device edged it out. If you'd like to try it out on your data set: Thank you for your work :-) A version with your function, D version #8: [snip] Your function can't be inlined because it's big, so this code isn't faster than inlined code like this generated by the compiler: (codon.length == 3 && codon[0] == 'T' && codon[1] == 'A' && codon[2] == 'G') Bye, bearophile Still, part of the point was that string comparisons in general were way too slow. Anyways, I've applied the same technique in a partially unrolled version if you want to check it out: bool arrayComp(T, size_t N)(const T[] a, ref const T[N] b) pure nothrow { static if(T.sizeof*N <= uint.sizeof) { return a.length == N && !( (*(cast(uint*)a.ptr) ^ *(cast(uint*)b.ptr)) & (uint.max >> 8*(uint.sizeof - T.sizeof*N) )); } else static if(T.sizeof*N <= ulong.sizeof) { return a.length == N && !( (*(cast(ulong*)a.ptr) ^ *(cast(ulong*)b.ptr)) & (ulong.max>> 8*(ulong.sizeof - T.sizeof*N) )); } else { // Fall back to a loop if(a.length != N || (*(cast(ulong*)a.ptr) != *(cast(ulong*)b.ptr)) ) return false; enum alignment = T.sizeof*N % ulong.sizeof > 0 ? T.sizeof*N % ulong.sizeof : ulong.sizeof; auto pa = cast(ulong*)(cast(ubyte*)a.ptr + alignment); auto pb = cast(ulong*)(cast(ubyte*)b.ptr + alignment); auto pa_end = cast(ulong*)(cast(ubyte*)a.ptr + T.sizeof*N); while (pa < pa_end) { if(*pa++ != *pb++ ) return false; } return true; } } Be warned that you'll have to use strings explicitly typed as immutable char[N], since both array literals and string literals won't match a this template.
Re: String compare performance
On Sun, 28 Nov 2010 20:32:24 -0500, Stewart Gordon wrote: On 27/11/2010 23:04, Kagamin wrote: bearophile Wrote: Also, is there a way to bit-compare given memory areas at much higher speed than element per element (I mean for arrays in general)? I don't know. I think you can't. You can use memcmp, though only for utf-8 strings. Only for utf-8 strings? Why's that? I would've thought memcmp to be type agnostic. Stewart. memcmp is type agnostic if all you want to compare is equality. The other use of memcmp is essentially as an opCmp, in which case it would be type sensitive.
Re: D's greatest mistakes
On Mon, 29 Nov 2010 16:43:01 -0500, Andrei Alexandrescu wrote: On 11/29/10 3:25 PM, Manfred_Nowak wrote: Daniel Gibson wrote: "with" shouldn't be much of a problem anymore. import std.stdio; struct X{ int a, b, c;} struct Y{ int a, b;} void main(){ X x; Y y; with( x){ c= 2; with( y){ c= 1; } } writeln( x.c); } Do you see the not "much of a problem"? -manfred Great. Could you please submit this as a bug report? It looks like it's actually an old bug: http://d.puremagic.com/issues/show_bug.cgi?id=1849
Re: D's greatest mistakes
On Tue, 30 Nov 2010 14:36:58 -0500, Walter Bright wrote: Denis Koroskin wrote: I'd go with omissible parens. There are SO many bugs and stuff that can't be worked around due to this. For function calls? Yeah, I fell into that one. Oops. I should have learned my lesson with the problems that C has with implicitly taking the function's address. D's omissible parenthesis strike me as being half-way between C#'s properties and Eiffel's Uniform Access Principle. Given Eiffel's influence on D, was there a reason why you didn't implement the uniform access principal instead of omissible parenthesis?
Re: D's greatest mistakes
On Tue, 30 Nov 2010 23:11:32 -0500, Walter Bright wrote: Robert Jacques wrote: D's omissible parenthesis strike me as being half-way between C#'s properties and Eiffel's Uniform Access Principle. Given Eiffel's influence on D, was there a reason why you didn't implement the uniform access principal instead of omissible parenthesis? I haven't studied Eiffel that much, and remember that D came out at the same time as C#, not after it. I thought omissible parenthesis were a late addition to the language (cira 2005/2006)
Re: Verbose checking of range category
On Sat, 11 Dec 2010 12:15:31 -0500, Andrei Alexandrescu wrote: [snip] This program will generate a valid executable, but will also print during compilation: Type int is not a random access range because: no empty property no front property no popFront method no indexing no slicing When a programmer has an odd issue with a range check, turning on verboseness of checks could help. What do you think? Andrei An issue with this is that failed template type checks are both extremely common and expected in template code, i.e. static if(isRandomAccessRange!T) {...}. So you'll get a lot of spurious error messages. Add in parallel builds, and the last error message won't even be the one you're looking for. So, while I think the concept is useful, I'd view this implementation as an intermediate stepping stone at best (and a distraction from fixing other bugs at worst). I'd recommend, as an alternative, having specific debug versions of the checks, i.e. isRandomAccessRangeDebug, as a way to avoid false positives.
Re: fast string searching
On Sun, 12 Dec 2010 23:11:03 -0500, Andrei Alexandrescu wrote: This looks promising. We could adopt it for Phobos. http://www.reddit.com/r/programming/comments/ekfct/efficient_substring_search_not_a_boyermoore/ The code has no license though... Andrei Have you e-mailed the author? Leonid Volnitsky leo...@volnitsky.com
Re: write, toString, formatValue & range interface
On Tue, 14 Dec 2010 05:02:41 -0500, spir wrote: Hello, Had a nice time debugging an issue after having added an input range interface to a (big) struct type. Finally managed to deduce that the problem happens when writing out an element of the struct type. This introduced an infinite loop ending in segfault. Found it weird, because the struct's toString does not iterate over the type, so there was no reason to use the range interface. This is why I guessed toString was not called. And in fact, forcing its use by explicitly calling .toString() solved the bug! 2 correspondents (Stephan Mueller & Ivan Melnychuk) helped me by pointing to the various template-selection criteria of formatValue. There seems to be a pair of bugs in the set of formatValue template constraints, which cause the following problems: * If a class defines both toString and a range interface, compiler error (additional bug pointed out by Stephan Mueller). * For structs, the presence of a range interface shortcuts toString. * If a range outputs elements of the same type, writing (and probably other features) runs into an infinite loop. This case is unchecked yet. The following changes may, I guess, solve the first two problems: (1) structs added (with classes) to the template selecting the use of toString (2) the template that selects the use of ranges checks there is no toString (3) the special case of using t.stringof for structs must be selected only as a last resort -- actually, this case may be suppressed and integrated into the general class/struct case. 
This means changing the following formatValue templates (quickly written, absolutely untested ;-): // case use toString (or struct .stringof): add structs void formatValue(Writer, T, Char)(Writer w, T val, ref FormatSpec!Char f) if (is(T == class) || is(T == struct)) { // in case of struct, detect whether toString is defined, else use T.stringof } // case use range interface: check no toString available // also add a test that the range does not output elements of the same type!!! void formatValue(Writer, T, Char)(Writer w, T val, ref FormatSpec!Char f) if ( isInputRange!T && !isSomeChar!(ElementType!T) || ! is(typeof(val.toString() == string)) ) {...} // special case use T.stringof for struct: useless? (else check no toString) void formatValue(Writer, T, Char)(Writer w, T val, ref FormatSpec!Char f) if ( is(T == struct) && !isInputRange!T && ! is(typeof(val.toString() == string)) ) { put(w, T.stringof); } Also, (1) the online doc of std.format seems outdated, no constraint for instance (2) the in-source doc is rather confusing, several comments do not describe the following code. Hope this helps, Denis -- -- -- -- -- -- -- vit esse estrany ☣ spir.wikidot.com Having recently run into this without knowing it, vote++. Also, please file a bug report (or two).
Re: Using unary expressions with property functions
On Mon, 13 Dec 2010 14:29:34 -0500, Jonathan M Davis wrote: On Monday, December 13, 2010 11:10:55 Andrej Mitrovic wrote: So I was a bit surprised today to find out that this sample C# code works: class LibraryItem { private int _numCopies; // Property public int NumCopies { get { return _numCopies; } set { _numCopies = value; } } public void BorrowItem(string name) { NumCopies--; } } Apparently you can use the -- and ++ unary ops on property functions, which might be convenient, I guess. But this won't work in D: class LibraryItem { private int _numCopies; @property int NumCopies() { return _numCopies; } @property void NumCopies(int value) { _numCopies = value; } public void BorrowItem(string name) { NumCopies--; } } void main() { } I get this back: Error: this.NumCopies() is not an lvalue So D tries to call the getter method "int NumCopies()" and apply -- on it. One way around this is to make the getter method return by reference: import std.stdio; class LibraryItem { private int _numCopies; @property ref int NumCopies() { return _numCopies; } @property void NumCopies(int value) { _numCopies = value; } public void BorrowItem(string name) { NumCopies--; writeln(NumCopies); // writes -1 after the call } } void main() { auto lib = new LibraryItem(); lib.BorrowItem("TDPL"); } So this works. But now the getter method acts both as a getter and a setter, which contradicts the idea of having separate get/set methods, don't you agree? I'm not sure how C# works. Does it use the getter method and return a reference for the -- op like in the last D example, or does it rewrite the whole operation as "NumCopies(-1)"? Maybe we can have functionality like that in D without having to change the getter method to return a reference... What do you think? I would think that it would be doable if something like ++lib.NumCopies were lowered to lib.NumCopies(++lib.NumCopies()). 
If it were post-increment, it would then probably turn it into something more like this: auto toInc = lib.NumCopies(); auto temp = toInc; lib.NumCopies(++toInc); Of course, given that you could use pre and post-increment in the middle of an expression, that could get quite a bit more complicated, and a simple lowering could be rather difficult, but the basic idea should still hold. I don't see why it couldn't be done other than that figuring out the exact way to lower it in order to deal with the fact that it could be in the middle of an expression could be a bit entertaining. I say open an enhancement request for it. Perhaps someone else can point out why it would be a bad idea, but on the surface, it seems quite doable. - Jonathan M Davis I know this concept has been extensively discussed before, i.e. the problem of a.b.c.d = 1, when b, c or d happen to be a method/property.
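The hand-written lowering discussed above can be sketched directly; this is what the compiler could emit for `lib.NumCopies--` given a getter/setter pair (class and property names taken from the post, the lowering itself is the proposal, not current dmd behaviour):

```d
class LibraryItem
{
    private int _numCopies = 1;
    @property int NumCopies() { return _numCopies; }
    @property void NumCopies(int value) { _numCopies = value; }
}

void main()
{
    auto lib = new LibraryItem;
    // Proposed lowering of `lib.NumCopies--`:
    auto temp = lib.NumCopies;  // read through the getter
    lib.NumCopies = temp - 1;   // write through the setter
    assert(temp == 1 && lib.NumCopies == 0);
}
```

The getter stays a pure read and the setter a pure write, so the separation of get/set is preserved; only the temporary is needed to give post-increment its value-before semantics.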
Re: write, toString, formatValue & range interface
On Wed, 15 Dec 2010 04:48:43 -0500, spir wrote: On Tue, 14 Dec 2010 09:35:20 -0500 "Robert Jacques" wrote: Having recently run into this without knowing it, Which one? (This issue causes 3 distinct bugs.) A struct + opDispatch combination resulted in a compile time error for me. Of course, since the opDispatch routine returned itself, I figure an infinite loop would have occurred if it did compile. Right now, I've added template constraints, but it is a suboptimal solution to the problem.
Re: type classes for selection of template variant
On Wed, 15 Dec 2010 07:58:36 -0500, spir wrote: Hello, see http://en.wikipedia.org/wiki/Type_class I have tried several times to express the idea of using interfaces (only) for template case selection, instead of a combination of really painful & hardly readable is() expressions and ad-hoc functions like isRandomAccessRange. (No luck in these trials ;-) Just discovered this idea is actually the same as 'type classes' introduced in Haskell (not that I'm a fan of functional programming, just stepped on the article by chance). D does not need any new language feature: interfaces are perfect for that. Maybe the syntax could be slightly extended to allow expressing negative constraints, like the following (defining an interface for types that support '==', but not comparison): interface EqualityOnly : Equality !: Comparable {} Denis -- -- -- -- -- -- -- vit esse estrany ☣ spir.wikidot.com No new syntax is required: templates can already test whether two types are similar in a 'duck' sense today, thanks to compile-time reflection.
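A sketch of the "no new syntax required" point: template constraints built from is(typeof(...)) act like structural (duck-typed) type classes, including the negative constraint from the EqualityOnly example (the helper names here are made up for illustration):

```d
struct Eq { int v; }  // default '==' works; no opCmp, so no ordering

enum hasEquality(T)   = is(typeof(T.init == T.init) : bool);
enum hasComparison(T) = is(typeof(T.init < T.init) : bool);

// Accepts types with '==' but without ordering -- an "EqualityOnly" class.
bool same(T)(T a, T b) if (hasEquality!T && !hasComparison!T)
{
    return a == b;
}

void main()
{
    assert(same(Eq(1), Eq(1)));                     // Eq: == but no <
    static assert(!__traits(compiles, same(1, 2))); // int is also ordered
}
```

The `!hasComparison!T` term is exactly the `!: Comparable` negative constraint from the post, expressed with today's compile-time reflection instead of new interface syntax.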
Re: JSON (was: emscripten )
On Wed, 15 Dec 2010 09:38:16 -0500, Adam D. Ruppe wrote: Vladimir Panteleev Wrote: if (resource == "/getfoo") { struct FooResult { int n; string s; } return toJSON(FooResult(42, "bar")); // {"n":42,"s":"bar"} } What kind of code did you use there? My app does something similar using wrappers of std.json. JSONValue toJsonValue(T)(T a) { JSONValue val; static if(is(T == JSONValue)) { val = a; } else static if(__traits(compiles, a.makeJsonValue())) { val = a.makeJsonValue(); } else static if(isIntegral!(T)) { val.type = JSON_TYPE.INTEGER; val.integer = to!long(a); [..] And it goes right through a variety of types, including structs where it does __traits(allMembers), and ultimately settles for to!string if nothing else fits. My program also reads json, but I had some trouble with std.json, so I had to fork it there. It has a helper function jsonValueToVariant (which just became fully usable in dmd 2.050, shortening my code a lot, thanks phobos/dmd devs!) which pulls it into a std.variant for easy using later. The trouble I had was std.json.parseJSON claims to be able to handle arbitrary input ranges, but when I actually instantiated it on plain old string, it refused to compile. I made it work by switching it to just normal strings in my private fork. What's really cool about these templates is it enables automatic calling of functions from the outside. You write a function like: struct User { ... } User getUserInfo(int id) { } And then this is accessible, through template magic, as: /app/get-user-info?id=1 Returns a full HTML document with the info /app/get-user-info?id=1&format=json Returns the User struct converted to json /app/get-user-info?id=1&format=xml The struct as a kind of xml (I didn't spend much time on this so it still sucks) And more. Way cool. 
Anyway, I thought about committing some of my json changes back to std.json, but removing the range capability goes against the grain there, and adding the templates seems pointless since everyone says std.json is going to be trashed anyway. I thought I might have actually been the only one using it! I'm curious what you did in your code. Is it a custom module or did you build off of std.json too? Hi Adam, I've been working on a replacement for std.json. I posted a preview to the phobos list, but I haven't gotten any feedback yet, as it's not very high priority. Here is my original post: I have been working on a re-write of std.json. The purpose was to fix implementation bugs, better conform to the spec, provide a lightweight tokenizer (Sean) and to use an Algebraic type (Andrei) for JSON values. In the process of doing this, I made my parser 2x faster and updated/fixed a bunch of issues with VariantN in order to fully support Algebraic types. Both of these libraries are at a solid beta level, so I'd like to get some feedback, and provide a patch for those being held back by the problems with Algebraic. The code and docs are available at: https://jshare.johnshopkins.edu/rjacque2/public_html/. These files were written against DMD 2.050 and both depend on some patches currently in Bugzilla (see the top of each file or below). Summary of Variant changes: * Depends on Issue 5155's patch. * VariantN now properly supports types defined using "This". * Additional template constraints and acceptance of implicit converters in opAssign and the ctor, i.e. if an Algebraic type supports reals, you can now assign an int to it. * Updated to using opBinary/opBinaryRight/opOpAssign. This adds right-hand-side support to several functions and is now generated via compile-time reflection + mixins: i.e. Algebraic types of user-defined types should work, etc. * Added opIn support, though it currently only works for AAs. * Added opIndexOpAssign support. 
* Added opDispatch as an alternative indexing method. This allows Variants of type Variant[string] to behave like prototype structs: i.e. var.x = 5; instead of var["x"] = 5; Notes: * There's a Bugzilla issue requesting opCall support in Variant. While I can see the usefulness, syntactically this clashes with the ctor. Should this issue be closed or should a method be used as an opCall surrogate? * Could someone explain to me the meaning/intention of "Future additions to Algebraic will allow compile-time checking that all possible types are handled by user code, eliminating a large class of errors." Is this something akin to final switch support? Summary of JSON changes: * Depends on the Variant improvements. * Depends on Issue 5233's patch. * Depends on Issue 5236's patch. * Issue 5232's patch is also recommended. * The integer type was removed: JSON doesn't differentiate between floating and integral numbers. Internally, reals are used and on systems with 80-bit support, this encompasses all integral types. * UTF escape cha
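The opDispatch-as-indexing idea mentioned above can be sketched with a toy string-keyed bag (this is a stand-in for illustration, not the proposed std.variant code):

```d
// Toy sketch: `bag.x = 5` stands in for `bag["x"] = 5` via opDispatch.
struct Proto
{
    int[string] data;
    void opDispatch(string name)(int value) { data[name] = value; }
    int opDispatch(string name)() { return data[name]; }
}

void main()
{
    Proto var;
    var.x = 5;              // rewritten as var.opDispatch!"x"(5)
    assert(var.x == 5);     // rewritten as var.opDispatch!"x"()
    assert(var.data["x"] == 5);
}
```

The compiler rewrites any unknown member access into an opDispatch instantiation with the member name as a compile-time string, which is what lets a Variant[string] behave like a prototype struct.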
Re: JSON (was: emscripten )
On Wed, 15 Dec 2010 13:31:37 -0500, Adam D. Ruppe wrote: Robert Jacques wrote: *snip code* I'm just quickly looking it over, with some brief comments. Your code looks good. You covered everything, and your use of tupleof seems to do a better job than my own use of allMembers! Very cool. I guess I'm actually very brief, but generally it looks nice. I'll try using it in a project sometime (hopefully soon) and may have some meaty comments then, but from the source, I think I'll like it. Thanks. I also used allMembers in my first iteration, but it gives what I'd consider false positives, so you have to filter the results.
Re: JSON
On Wed, 15 Dec 2010 13:42:39 -0500, Andrei Alexandrescu wrote: On 12/15/10 11:49 AM, Robert Jacques wrote: [Algebraic] Are all bugfixes that your work depends on rolled into the upcoming dmd release? Andrei I don't think so, each bug seems to still be open and I haven't seen any posts to the phobos list mentioning them. The bug reports do all contain patches though. Issue 5155: http://d.puremagic.com/issues/show_bug.cgi?id=5155 Issue 5233: http://d.puremagic.com/issues/show_bug.cgi?id=5233 Issue 5236: http://d.puremagic.com/issues/show_bug.cgi?id=5236 Issue 5232: http://d.puremagic.com/issues/show_bug.cgi?id=5232
Re: Cross-post from druntime: Mixing GC and non-GC in D. (AKA, "don't touch GC-references from DTOR, preferably don't use DTOR at all")
On Wed, 15 Dec 2010 16:23:24 -0500, Ulrik Mikaelsson wrote: Cross-posting after request on the druntime list: -- Hi, DISCLAIMER: I'm developing for D1/Tango. It is possible these issues are already resolved for D2/druntime. If so, I've failed to find any information about it, please do tell. Recently, I've been trying to optimize my application by swapping out some resource allocation (file-descriptors for one) to reference-counted allocation instead of GC. I've hit some problems. Problem === Basically, the core of all my problems is something expressed in http://bartoszmilewski.wordpress.com/2009/08/19/the-anatomy-of-reference-counting/ as "An object’s destructor must not access any garbage-collected objects embedded in it.". [snip] Having run into this problem with CUDA C language bindings, I do feel your pain. However, the fact that "An object’s destructor must not access any garbage-collected objects embedded in it." is a key assumption made by all GC algorithms (that I know of). Yes, D's current GC only does full collections, so a child object knows that its parent objects are either valid or are being collected at the same time it is. But this isn't true for generational collectors, and I wouldn't want D to exclude itself from a wide range of modern GCs.
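The invariant under discussion can be made concrete with a small sketch (names hypothetical): during a collection, finalization order is unspecified, so a destructor may only touch plain-value members, never GC-managed ones.

```d
import core.stdc.stdio : printf;

class Resource
{
    int fd = -1;        // plain value: safe to use in the destructor
    ubyte[] buffer;     // GC-managed: may already be reclaimed when ~this runs

    ~this()
    {
        if (fd != -1)
            printf("closing fd %d\n", fd);  // OK: no GC memory touched
        // buffer[0] = 0;  // WRONG: buffer may point to freed memory
    }
}

void main()
{
    auto r = new Resource;
    r.fd = 3;
}   // r is finalized by the GC at some unspecified later point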
Re: Threads and static initialization.
On Sat, 18 Dec 2010 03:27:22 -0700, Pelle Månsson wrote: On 12/18/2010 10:00 AM, Jonathan M Davis wrote: The problem is that the OP wants the static constructors to be skipped. If they're skipped, anything and everything which could be affected by that can't be used. That pretty much means not using TLS, since the compiler isn't going to be able to track down which variables in TLS will or won't be affected by it. So, you're stuck using shared memory only. _That_ is where the problem comes in. Exactly, not using TLS. You can still use the heap, as it is not thread local. Meaning you can create non-shared anything all you like, as long as you're not using TLS. Except that the 'heap' internally uses TLS. The GC does need and use TLS.
Re: Threads and static initialization.
On Sat, 18 Dec 2010 15:04:38 -0700, Michel Fortin wrote: On 2010-12-18 15:57:50 -0500, "Robert Jacques" said: On Sat, 18 Dec 2010 03:27:22 -0700, Pelle Månsson wrote: On 12/18/2010 10:00 AM, Jonathan M Davis wrote: The problem is that the OP wants the static constructors to be skipped. If they're skipped, anything and everything which could be affected by that can't be used. That pretty much means not using TLS, since the compiler isn't going to be able to track down which variables in TLS will or won't be affected by it. So, you're stuck using shared memory only. _That_ is where the problem comes in. Exactly, not using TLS. You can still use the heap, as it is not thread local. Meaning you can create non-shared anything all you like, as long as you're not using TLS. Except that the 'heap' internally uses TLS. The GC does need and use TLS. Using D's TLS for the GC is an implementation choice, not a requirement. If someone wants to optimize 'spawn' for pure functions by skipping D's TLS initialization, he can make the GC and the array appending cache work with that. Not really, as _every_ modern GC requires TLS. And we're talking about a performance optimization here: not supporting modern GCs in order to remove TLS initialization would be a penny-wise pound foolish move. Besides, 'mini' threads shouldn't be created using OS threads; that's what thread-pools, fibers and tasks are for.
Re: Owned members
On Sat, 25 Dec 2010 14:23:47 -0700, Alex Khmara wrote: On Sat, 25 Dec 2010 14:42:48 -, bearophile wrote: spir: I would enjoy seeing a concrete, meaningful example. Often enough my class/struct members are arrays, and often I'd like the compiler to help me be more sure their memory is not shared (with a slice, for example) with something outside the instance. See below. It seems your point is to avoid sharing ref'ed elements --which is precisely the purpose of referencing, isn't it? That's one of the purposes of references, but there are other purposes. A dynamic array is allocated on the heap through a kind of fat reference so you are able to change its length. Objects in D are always managed by reference, so you have no choice. What is the sense of having referenced elements if the references are not to be shared? They are owned by the class instance :-) And even if you use emplace or scoped from Phobos, you still have a reference, so @owned is useful even for scoped objects. Then, you would need a tag like @owned to give back value semantics to referenced elements... Correct? @owned doesn't change the semantics and probably the resulting binary is unchanged. Its purpose is just to disable certain undesired (and buggy) behaviours, to keep the class instance the only owner of the referenced object/array. What about the opposite (much rarer) case: It's another thing. Googling should point you to 2-3 articles by Bertrand Meyer I remember something by Meyer, but I don't remember if he was talking about instance ownership. I will read something again. Bye, bearophile I don't understand how this can be implemented in more complicated cases: class X { @owned private int[] foo; int[] f1() { auto fooSlice = foo[0 .. 3]; // is this valid? someExternalFunc(foo); // is this allowed? someExternalFunc(fooSlice); // how about this? return someFuncReturningArrayArg(fooSlice); // how to detect this? 
} } It seems that @owned can work only in very primitive cases -- otherwise complex escape analysis is needed, and I don't even know if it will help. Maybe, if only pure functions are allowed to accept @owned arguments, it will be OK, but the power of this feature will be severely limited in that case. This @owned is very similar to previous 'scope' proposals (and oddly dissimilar to previous 'owned' proposals). To answer your question, under previous proposals the scope keyword would allow you to declare that a variable doesn't escape the current scope. So you could define external functions that would take a 'scope int[]' and be guaranteed that it wouldn't escape. (Returning scoped values has to obey certain restrictions.) The previous 'owned' proposals are a bit more general. Owned types allow you to parameterize a type on a variable, which sounds complex, but what it means in terms of the runtime is that all objects in the same ownership group share the same monitor. The big advantage is that inside a synchronized method you don't have to synchronize other 'owned' objects of the same type. It also makes things like unique a lot more viable, since you can now define trees, etc.
Re: Owned members
On Sat, 25 Dec 2010 15:18:07 -0700, Alex Khmara wrote: On Sat, 25 Dec 2010 19:18:43 -, Robert Jacques wrote: This @owned is very similar to previous 'scope' proposals (and oddly dissimilar to previous 'owned' proposals). To answer your question, under previous proposals the scope keyword would allow you to declare that a variable doesn't escape the current scope. So you could define external functions that would take a 'scope int[]' and be guaranteed that it wouldn't escape. (Returning scoped values has to obey certain restrictions.) The previous 'owned' proposals are a bit more general. Owned types allow you to parameterize a type on a variable, which sounds complex, but what it means in terms of the runtime is that all objects in the same ownership group share the same monitor. The big advantage is that inside a synchronized method you don't have to synchronize other 'owned' objects of the same type. It also makes things like unique a lot more viable, since you can now define trees, etc. Ok, I'll try to find previous proposals about 'scope' -- for now I cannot understand how you can prevent an external function from, e.g., saving an owned object or array (i.e. references) in a global variable, if it's at all possible to pass owned variables into external functions. Well, think about pure. A pure function can call other pure functions, because those functions declare that they obey the rules of pure (i.e. no globals, etc). Scope variables can be passed to functions taking scoped parameters because those functions declare that they'll obey the rules of scope (i.e. they won't squirrel away references to them, etc).
Re: Owned members
On Sat, 25 Dec 2010 16:38:19 -0700, Alex Khmara wrote: Well, think about pure. A pure function can call other pure functions, because those functions declare that they obey the rules of pure (i.e. no globals, etc). Scope variables can be passed to functions taking scoped parameters because those functions declare that they'll obey the rules of scope (i.e. they won't squirrel away references to them, etc). And so we have another complication in the language -- another transitive type modifier. Well, it's only a proposal at this point. And things regarding scope/ownership have been fairly explicitly delayed to D3.
Re: typeof(t) not working correctly?
On Sun, 26 Dec 2010 22:48:42 -0700, %u wrote: Hi, I'm running this code below, and it returns an array of length zero for both print statements. Is this a bug, or am I missing something? Thank you! private import std.stdio; private import std.traits; class Temp { public int base1; public void foo() { writeln("foo base called"); } } class Temp2 : Temp { public int derived1; public override void foo() { writeln("foo derived called"); } } void main() { Temp t = new Temp2(); writeln(typeid(t).offTi().length); //Prints 0 writeln(typeid(t).getMembers(null).length); //Prints 0 } typeid is working correctly, (it simply returns the TypeInfo for a type), but it is a well known issue that the offset type info array is not populated. It is bug 1348 (http://d.puremagic.com/issues/show_bug.cgi?id=1348). You can vote it up in Bugzilla if you'd like. In general, questions like this should be sent to D.learn, and I also generally recommend doing a quick search in Bugzilla just to see if you get a hit against a known issue.
Re: Clay language
On Mon, 27 Dec 2010 13:42:50 -0700, Guilherme Vieira wrote: On Mon, Dec 27, 2010 at 4:35 PM, bearophile wrote: Through Reddit I have found a link to some information about the Clay language, it wants to be (or it will be) a C++-class language, but it's not tied to C syntax. It shares several semantic similarities with D too. It looks like a cute language: https://github.com/jckarter/clay/wiki/ Some small parts from the docs: -- In Clay this: https://github.com/jckarter/clay/wiki/Syntax-desugaring static for (a in ...b) c; is equivalent to: { ref a = ; c; } { ref a = ; c; } /* ... */ I have an enhancement request about this for D: http://d.puremagic.com/issues/show_bug.cgi?id=4085 The part about Safer pointer type system is very similar to what I did ask for D, and it looks similar to what Ada language does (for Clay this is just a proposal, not implemented yet, but Ada is a rock-solid language): https://github.com/jckarter/clay/wiki/Safer-pointer-type-system This is something that I want for D too, it's important: >Jonathan Shapiro (of BitC) makes an excellent argument that, in a systems language, it is often undesirable to depend on the whims of an ill-specified optimizer to convert abstract code into efficient machine code. The BitC specification thus includes the idea of guaranteed optimizations, to allow code to be written in a high-level style with predictably low or nonexistent runtime cost (link). [...] Because Clay seeks to support systems programming with high-level abstraction, certain patterns should be guaranteed to be optimized in a certain way, instead of being left to the whims of LLVM or a C compiler. Additional optimizations should not be prevented, however. [...] 
It should be possible to specify that one or more of these optimizations is required, and have the compiler raise an error when they cannot be applied for some reason.< https://github.com/jckarter/clay/wiki/Guaranteed-optimizations Bye, bearophile +1 for static for and guaranteed optimizations. Can we put it in the wishlist? If you followed the bug report, you'd find D already has a way of doing static foreach.
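The "way of doing static foreach" the reply alludes to is foreach over a compile-time tuple, which the compiler unrolls during compilation (using the modern std.meta name; at the time of this thread it was std.typetuple.TypeTuple):

```d
import std.stdio;
import std.meta : AliasSeq;  // was std.typetuple.TypeTuple in 2010-era D

void main()
{
    // This loop is unrolled at compile time, one body per type.
    foreach (T; AliasSeq!(byte, short, int, long))
        writeln(T.stringof, ": ", T.sizeof, " bytes");
}
```

Each iteration is a separate instantiation, so the body may use T in types and static ifs, which is most of what a dedicated `static for` would buy.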
Re: GC conservatism -- again
On Mon, 27 Dec 2010 09:12:53 -0700, Steven Schveighoffer wrote: While fixing a design issue in druntime, I re-discovered how crappy the conservative GC can be in certain situations. The issue involves the array appending cache, which is used to significantly speed up array appends. Essentially, since the array appending cache was scanned as containing pointers, it would 'hold hostage' any arrays that were appended to. As it turns out, this was necessary, because there was no mechanism to update the cache when the blocks were collected by the GC. I added this mechanism, and discovered -- it didn't help much :) The test case I was using was posted to the newsgroup a few months back. Basically, the test was to append to an array until it consumed at least 200MB. A single test takes a while, but what's more disturbing is, if you run the same test again, the memory used for the first test *isn't released*. My first thought was that the array append cache was holding this data hostage, but that was not the problem. The problem is that when you allocate 1/20th the address space of the process to one contiguous memory block, the chances that the conservative GC will detect a false pointer into that block are very high. What's worse, if the 'pointer' into the block is somewhere in TLS, global data, or high on the stack, that block is stuck for pretty much the life of the program or thread. So I was thinking of possible ways to solve this problem. Solving it perfectly is not really possible unless we implement precise scanning in all areas of memory (heap, stack, global data). While that certainly *could* be a possibility, it's not likely to happen any time soon. What about tools to make deallocation easier? For example, we have scope(exit) that you could potentially use to ensure a memory block is deallocated on exit from a scope, what about a thread exit? What about declaring a scope object at a high level that nested scopes could use to deallocate from? 
Making this a bit easier might be a good alternative while precise scanning hasn't been adopted yet. Any other ideas? -Steve First, I'd like to point out that precise scanning of the heap (and I'll assume this can be extended to globals), is a long standing enhancement request. It's issue 3463 (http://d.puremagic.com/issues/show_bug.cgi?id=3463). It does have a patch, but it's now out of date and needs someone to update it (hint, hint). Second, the false pointer problem disappears (for practical purposes) when you move to 64-bit. Third, modern GCs (i.e. thread-local GCs) can further reduce the false pointer issue.
Re: align(n) not working as expected
On Tue, 28 Dec 2010 00:32:37 -0700, %u wrote: Hi, I'm not sure if I'm doing something wrong, but it seems like struct alignment isn't really working. (I just found out that I'm not supposed to post this on the digitalmars.D.bugs newsgroup, so I'm posting it here.) When I execute this code: struct Temp { ubyte x; align(16) ubyte y; } auto o = Temp(); std.stdio.writefln("Address of aligned fields: %#x, %#x", cast(size_t)&o.x, cast(size_t)&o.y); I get these addresses: 0x18fd00, 0x18fd01 the second of which is not aligned on a 16-byte boundary. Am I doing something wrong, or is this a bug? Thank you! As per the docs, align behaves in the manner of the companion C++ compiler. DMC only defines align(1) and align(4), so they're the only two that work. So this isn't a bug per se, but more than one of us has asked for align(8)/align(16) support (or at least a compile-time warning). But there are several technical/performance issues with maintaining alignment different from that of the underlying OS. I'd also recommend D.learn for questions like these.
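A quick way to check what alignment a field actually received is .offsetof, which avoids depending on where the stack happens to be (this sketch uses align(4), one of the values the reply says dmd honours):

```d
import std.stdio;

struct Temp
{
    ubyte x;
    align(4) ubyte y;   // request 4-byte alignment for y
}

void main()
{
    writeln("y offset: ", Temp.y.offsetof);
    assert(Temp.y.offsetof % 4 == 0);   // y lands on a 4-byte boundary
}
```

Checking the offset within the struct is more reliable than printing addresses, because the struct itself must also be allocated at a suitably aligned address for the field alignment to show up in absolute addresses.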
Re: streaming redux
On Tue, 28 Dec 2010 00:02:29 -0700, Andrei Alexandrescu wrote: I've put together over the past days an embryonic streaming interface. It separates transport from formatting, input from output, and buffered from unbuffered operation. http://erdani.com/d/phobos/std_stream2.html There are a number of questions interspersed. It would be great to start a discussion using that design as a baseline. Please voice any related thoughts - thanks! Andrei Here are my initial thoughts and responses to the questions. Now to go read everyone else's. Re: TransportBase Q1: Internally, I think it is a good idea for a transport to support lazy opening, but I'm not sure of the hassle/benefit trade-off of exposing this to user code. If open is supported, I don't think it should take any parameters. Q2: If seek isn't considered universal, having an isSeekable and rewind might be beneficial. But while I know of transports where seeking might be slow, I'm not sure which ones wouldn't support it at all, or only support rewind. Q3: Yes, to seek + tell and getting rid of seekFromXXX. Re: UnbufferedInputTransport Q1: I think that read should be allowed to return less than the buffered length, but since the transport should know the most efficient way to block on an input, I don't think returning a length-zero array is valid. Re: BufferedInputTransport Q1: I think it's valid for the front of a buffered input to be empty: an empty front simply means that popFront should be called. popFront should be required to fill at least some of front (see UnbufferedInputTransport Q1). Q2: Semantically, 'advance' feels too much like popFront: I want to advance my input and I'm intending to work with it. The seek routines, on the other hand, feel more like indexing: I want to do something with that index, but I do not necessarily need everything in between. In particular, I'd expect long seeks to reduce the front array to zero elements, while I'd expect advance to enlarge the internal buffer if necessary. 
Re: Formatter Q1: I don't think formatters should be responsible for buffering, but certain formats require rather extensive buffering that can't be provided by the current buffered transport classes (BSON comes to mind). My initial impression is that seek, etc. should be able to handle these use cases, but adding a buffer hint setter/getter might be a good idea. The idea being that if the formatter knows that it will come back to this part of the stream, it can set a hint, so the buffer can make a more intelligent choice of when/where to flush internally. Q2: putln only makes sense in terms of text-based streams, plus it adds a large number of methods to implement. So I'm a bit on the fence about it. I think writefln would be a better solution to a similar problem. Q3: The issue I see with a reflection-based solution is that the runtime reflection system should respect the visibility of the member: i.e. private variables shouldn't be accessible. But to do effective serialization, private members are generally required. As for the more technical aspects, combining __traits(derivedMembers, T) and BaseClassesTuple!T can determine which classes overload toString, etc. Q4: Reading/writing the same sub-object is an internal matter, in my opinion. The really important aspect is handling slices, etc. nicely for formats that support cyclic graphs. For which, the only thing missing is put(void*) to handle pointers (I think). Q5: I think handling AA's with hooks is the best option with this design, though I only see a need for start and end. The major issue is that reading should be done as a tuple, which basically breaks the interface idiom. Alternatively, callbacks could be used to set read's mode: i.e. readKeyMode, readValueMode & putKeyMode, putValueMode. Q6: Well, toString and cast(int/double/etc.) should go a long way to covering most of the printf specifiers. Q7: Yes, writefln should probably be supported for text-based transports. 
Re: Unformatter Q1: Implementations should be free (and indeed encouraged) to minimize allocations by returning a reusable buffer for arrays. So the stream should be responsible for inferring the size of an array. Q2: See Formatter Q3. Q3: See Formatter Q5. Other Formatter/Unformatter thoughts: For objects, several formats also require additional meta information (i.e. a unique string id, member offset position, etc), while others don't.
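The reflection approach from Formatter Q3 can be sketched as follows: check whether a class itself declares toString rather than merely inheriting Object's (the helper name is made up for illustration):

```d
class A { override string toString() { return "A"; } }
class B : A { }   // declares nothing; inherits A.toString

// Hypothetical helper: does T's own declaration list contain toString?
enum declaresToString(T) = {
    bool found = false;
    foreach (m; __traits(derivedMembers, T))
        static if (m == "toString") found = true;
    return found;
}();

static assert(declaresToString!A);
static assert(!declaresToString!B);

void main() { }
```

Walking BaseClassesTuple!T with the same check then tells a serializer at which level of the hierarchy toString (or any other hook) was actually overloaded.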
Re: D vs C++
On Tue, 28 Dec 2010 04:49:54 -0700, Max Samukha wrote: Another QVariant feature I would like to see in Variant is a constructor taking the type descriptor and a void pointer to the value. For example, it is needed for constructing Variants from variadic arguments. For what it's worth, I've been working on improving Variant and added this to my to do list when I read Issue 2846 a while ago. I've also checked it off the to do list :)
Re: streaming redux
On Tue, 28 Dec 2010 23:34:42 -0700, Andrei Alexandrescu wrote: On 12/28/10 11:54 AM, Sean Kelly wrote: Andrei Alexandrescu Wrote: On 12/28/10 5:09 AM, Vladimir Panteleev wrote: abstract interface Formatter; I'm really not sure about this interface. I can see at most three implementations of it (native, big-endian and little-endian variants), everything else being too obscure to count. I think it should be implemented as static structs instead. Also, having an abstract method for each native type is quite ugly for D standards, I'm sure there's a better solution. Nonono. Perhaps I chose the wrong name, but Formatter is really anything that takes typed data and encodes it in raw bytes suitable for transporting. That includes e.g. json, csv, and also a variety of binary formats. This one is really difficult to get right. JSON, for example, has named members of its object type. How could the name of a field be communicated to the formatter? The best I was able to do with C++ iostreams was to create an abstract formatter class that knew about the types I needed to format and have protocol-specific derived classes do the work. Here's some of the dispatching code: printer* get_printer( std::ios_base& str ) { void*& ptr = str.pword( printer::stream_index() ); if( ptr == NULL ) { str.register_callback( &printer_callback, printer::stream_index() ); ptr = new xml_printer(); } return static_cast( ptr ); } std::ostream& operator<<( std::ostream& os, const message_header& val ) { printer* ptr = get_printer( os ); return (*ptr)( os, val ); } Actually using this code to write data to a stream looks great: ostr << header << someobj << anotherobj << end_msg; but I'm not happy about how much specialized underlying code needs to exist. I guess what I'm saying is that a generic formatter may be great for simple formats like zip streams, CSV files, etc, but not so much for more structured output. 
That may be a sufficient goal for std.stream2, but if so I'd remove JSON from your list of possible output formats :-) I agree with the spirit. In brief, I think it's fine to have a Json formatter as long as data is provided to it as Json-friendly types (ints, strings, arrays, associative arrays). In other words, I need to simplify the interface to not attempt to format class and struct types - only built-in types. By the way, JSON doesn't support associative arrays in general. It only supports AA in the sense that JSON objects are an array of string:value pairs.
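To make the design question concrete, here is a minimal D sketch of the kind of Formatter being discussed -- typed data in, encoded text out -- restricted to built-in types as agreed above. All names here are illustrative, not from the actual std.stream2 proposal:

```d
import std.array : Appender;
import std.conv : to;

// Hypothetical encoder interface: one put() overload per built-in type.
// This is a sketch of the idea under discussion, not Andrei's real design.
interface Formatter
{
    void put(int value);
    void put(string value);
}

// One concrete implementation: comma-separated text.
class CsvFormatter : Formatter
{
    private Appender!(char[]) buf;
    private bool first = true;

    private void sep() { if (!first) buf.put(','); first = false; }

    void put(int value)    { sep(); buf.put(value.to!string); }
    void put(string value) { sep(); buf.put(value); }

    string data() { return buf.data.idup; }
}

void main()
{
    Formatter f = new CsvFormatter;
    f.put(42);
    f.put("hello");
    assert((cast(CsvFormatter) f).data == "42,hello");
}
```

A binary or JSON formatter would slot in as another class behind the same interface, which is exactly where the named-field problem Sean raises shows up: nothing in `put` carries a field name.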
Re: GC conservatism -- again
On Wed, 29 Dec 2010 07:37:10 -0700, Steven Schveighoffer wrote: On Tue, 28 Dec 2010 01:23:22 -0500, Robert Jacques wrote: First, I'd like to point out that precise scanning of the heap (and I'll assume this can be extended to globals), is a long standing enhancement request. Yes, I know. Does it also do precise scanning of the stack and global/TLS data? Because that also needs to happen (I think you need a lot more compiler support for that) to really fix this problem. Globals and TLS should be possible, but the stack isn't without some major architectural changes (tagging+filtering, dual-stacks, etc). Second, the false pointer problem disappears (for practical purposes) when you move to 64-bit. I'm not sure I like this "solution", but you are correct. This is somewhat mitigated however by the way memory is allocated (I'm assuming not sparsely throughout the address space, and also low in the address space). It certainly makes it less likely that a 64-bit random long points at data, but it's not inconceivable to have 32 bits of 0 interspersed with non-zero data. It might be likely to have a struct with two ints back to back, where one int is frequently 0. Ah, but the GC can allocate RAM in any section of the address space it wants, so it would be easy for the upper 32 bits to be always non-zero by design. Third, modern GCs (i.e. thread-local GCs) can further reduce the false pointer issue. I'd rather have precise scanning :) There are issues with thread-local GCs. The only issue with thread-local GCs is that you can't cast to immutable and then share the result across threads. And eventually, we'll have a) better ways of constructing immutable and b) a deep idup, to mitigate this. If we have the typeinfo of a memory block for the GC to parse, you can also rule out cross-thread pointers without thread-local GCs (for unshared data). Which basically creates all the problems of thread-local GC, with very few of the advantages. 
For clarity's sake, I assume that thread-local GCs would be used in conjunction with a standard shared GC, in order to handle immutable and shared heap data correctly.
Re: GC conservatism -- again
On Wed, 29 Dec 2010 12:27:02 -0700, Steven Schveighoffer wrote: On Wed, 29 Dec 2010 14:00:17 -0500, Robert Jacques wrote: On Wed, 29 Dec 2010 07:37:10 -0700, Steven Schveighoffer wrote: On Tue, 28 Dec 2010 01:23:22 -0500, Robert Jacques wrote: Second, the false pointer problem disappears (for practical purposes) when you move to 64-bit. I'm not sure I like this "solution", but you are correct. This is somewhat mitigated however by the way memory is allocated (I'm assuming not sparsely throughout the address space, and also low in the address space). It certainly makes it less likely that a 64-bit random long points at data, but it's not inconceivable to have 32-bits of 0 interspersed with non-zero data. It might be likely to have a struct with two ints back to back, where one int is frequently 0. Ah, but the GC can allocate ram in any section of the address space it wants, so it would be easy for the upper 32-bits to be always non-zero by design. huh? How does the GC control whether you set one int to 0 and the other not? Hmm, perhaps an example would be best. Consider a 64-bit thread-local GC. It might allocate ram in the following pattern: [2-bit Shared/Local][16-bit thread ID][6-bit tag][40-bit pointer] So first, the address space is divided up into 4 regions, [11...] and [00...] are left free for user use (external allocation, memory-mapped files, etc), and [01...]/[10...] are used to denote shared/thread-local. Next you have the thread ID, which is one way to support thread-local allocation/thread-local GCs. Then you might have a 6-bit region that lock-free algorithms could use as a tag (and would be ignored by the shared GC), or local GCs could use for internal purposes. Finally, you have a 40-bit region with what we normally think of as a 'pointer'. The thing to understand, is that 64-bit computers are really 40-bit computers, currently. 
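The layout Robert proposes can be sketched with plain shifts and masks. The field widths below are the ones from the post; the helper names and region values are mine, for illustration only:

```d
// Sketch of the proposed 64-bit tagged-pointer layout:
// [2-bit shared/local][16-bit thread ID][6-bit tag][40-bit pointer]
// Names and helpers are illustrative, not from any real GC.
enum Region : ulong { userLow = 0b00, local = 0b01, shared_ = 0b10, userHigh = 0b11 }

ulong pack(Region r, ulong threadId, ulong tag, ulong addr)
{
    assert(threadId < (1UL << 16) && tag < (1UL << 6) && addr < (1UL << 40));
    return (cast(ulong) r << 62) | (threadId << 46) | (tag << 40) | addr;
}

Region region(ulong p)  { return cast(Region)(p >> 62); }
ulong threadId(ulong p) { return (p >> 46) & 0xFFFF; }
ulong tag(ulong p)      { return (p >> 40) & 0x3F; }
ulong addr(ulong p)     { return p & ((1UL << 40) - 1); }

void main()
{
    auto p = pack(Region.local, 7, 3, 0x1234_5678_9A);
    assert(region(p) == Region.local);
    assert(threadId(p) == 7);
    assert(tag(p) == 3);
    assert(addr(p) == 0x1234_5678_9A);
}
```

Note how a local-heap pointer always has a non-zero top region field, which is the "upper bits non-zero by design" point made above.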
And given that 40 bits will hold us until we get to petabytes, we should have 24 bits to add meta-info to our pointers for some time to come. So, as long as we choose this meta-info carefully, we can avoid common bit patterns and the associated false pointers. Third, modern GCs (i.e. thread-local GCs) can further reduce the false pointer issue. I'd rather have precise scanning :) There are issues with thread-local GCs. The only issue with thread-local GCs is that you can't cast to immutable and then share the result across threads. And eventually, we'll have a) better ways of constructing immutable and b) a deep idup, to mitigate this. I'm talking about collection cycles -- they necessarily need to scan both thread local and shared heaps because of the possibility of cross-heap pointing. Which means you gain very little for thread-local heaps. The only thing it buys you is you can assume the shared heap has no pointers into local heaps, and local heaps have no pointers into other local heaps. But if you can't run a collection on just one heap, there is no point in separating them. The defining feature of thread-local GCs is that they _can_ run collection cycles independently from each other. Plus it's easy to cast without moving the data, which is not undefined currently if you take the necessary precautions, but would cause large problems with separate heaps. Casting from immutable/shared would be memory valid, it's casting to immutable and shared where movement has to occur. As casting between immutable/shared and mutable is logically invalid, I'd expect that these cases would be rare (once we solve the problem of safely constructing complex immutable data). If we have the typeinfo of a memory block for the GC to parse, you can also rule out cross-thread pointers without thread-local GCs (for unshared data). Which basically creates all the problems of thread-local GC, with very few of the advantages. What are the advantages? 
I'm not being sarcastic, I really don't know. -Steve The major advantage is that they match or out-perform the modern generational/concurrent GCs, even when backed by conservative mark-sweep collectors (according to Apple). (TLGCs can be backed by modern collectors) Essentially, TLGCs work by separately allocating and collecting objects which are not-shared between threads. Since these collectors don't have to be thread safe, they can be more efficient in their implementation, and collections only have to deal with their subset of the heap and their own stack. This reduces pause times and false pointers, etc. TLGCs are inherently parallel and have interleaved pause times, which can greatly reduce the "embarrassing pause" effect (i.e. your worker thread running out of ram won't pause your GUI, or stop other workers from fulfilling http requests). In D, they become even more i
Re: PROPOSAL: Implicit conversions of integer literals to floating point
On Wed, 29 Dec 2010 23:46:19 -0700, Don wrote: BACKGROUND: D currently uses a very simple rule for parameter matching: * it matches exactly; OR * it matches using implicit conversions; OR * it does not match. There's an important extra feature: polysemous literals (those which can be interpreted in multiple ways) have a preferred interpretation. So 'a' is char (rather than wchar or dchar); 57 is an int (rather than short, byte, long, or uint); and 5.0 is a double (rather than float or real). This feature acts as a tie-breaker in the case of ambiguity. Notice that the tie-breaking occurs between closely related types. If you implement overloading on any two of the possibilities, you would always overload the preferred type anyway. (eg, it doesn't make sense to overload 'short' and 'uint' but not 'int'). So this all works in a satisfactory way. THE PROBLEM: Unfortunately, the tie-breaking fails for integer literals used as floating-point parameters. Consider: void foo(double x) {} void main() { foo(0); } This compiles correctly; 0 converts to double using implicit conversions. Now add: void foo(real x) {} void foo(float x) {} And now the existing code won't compile, because 0 is ambiguous. Adding such overloads is a common activity. It is totally unreasonable for it to break existing code, since ANY of the overloads would be acceptable. The language doesn't have any reasonable methods for dealing with this. The only one that works at all is to add a foo(int) overload. But it scales very poorly -- if you have 4 floating point parameters, you need to add 15 overloads, each with a different combination of int parameters. And it's wrong -- it forces you to allow int variables to be accepted by the function (but not uint, short, long !) when all you really need is for literals to be supported. And no, templates are most definitely not a solution, for many reasons (they break even more code, they can't be virtual functions, etc) This problem has already hit Phobos. 
We inserted a hack so that sqrt(2) will work. But exp(1) doesn't work. Note that the problems really arise because we've inherited C's rather cavalier approach to implicit conversion. PROPOSAL: I don't think there's any way around it: we need another level of implicit conversion, even if only for the special case of integer literals->floating point. From the compiler implementation point of view, the simplest way to do this would be to add a level between "exact" and "implicit". You might call it "matches with preferred conversions", or "match with literal conversions". A match which involves ONLY conversions from integer literals to double, should be regarded as a better match than any match which includes any other kind of implicit conversion. As usual, if there is more than one "preferred conversion" match, it is flagged as ambiguous. BTW, I do not think this applies to other literals, or other types. It only applies to polysemous types, and int literal->floating point is the only such case where there's no tie-breaker in the target type. It's a very special case, but a very common and important one. It applies to *any* overloaded floating point function. So I think that a special case in the language is justified. vote++.
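The breakage Don describes is easy to reproduce. This sketch shows the failure mode and the foo(int) workaround criticized above (which also, wrongly, accepts int variables):

```d
// Sketch of the overload-ambiguity problem from the proposal.
void foo(double x) {}

// Adding these overloads later makes foo(0) ambiguous,
// even though any of the three would be acceptable:
void foo(real x) {}
void foo(float x) {}

// The only current workaround; note it also accepts int variables,
// which is more than the literal-only behavior actually wanted.
void foo(int x) { foo(cast(double) x); }

void main()
{
    foo(0);    // without foo(int): "matches more than one declaration"
    foo(0.0);  // always fine: 0.0 is preferred as double
}
```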
Re: range practicle use
On Thu, 30 Dec 2010 05:41:18 -0700, spir wrote: Hello, In the course of a project (1) 2 partner D programmers and myself are currently implementing, we faced 2 issues which prevented us using a range interface as planned. We initially intended to do it for better compliance with D's coming new style, and nice inter-relation with Phobos modules like std.algorithm. Instead, we use opApply to implement traversal; which does the job. This post's purpose is to expose these problems to help make range interfaces better usable, or simply usable, in various practical cases. (Opinions are mine -- I don't know my partners' exact position, except that they indeed agree these topics plainly prevent us using ranges.) To illustrate the situation, please see an artificial example coded below (the real case would require explaining several irrelevant points). -1- textual output [snip] I too have been bitten by this bug. Specifically, when using opDispatch you have to remember to disable all of the range interfaces. -2- indexed iteration It seems there is no way to directly define indexed iteration using ranges, like commonly needed by: foreach(index, element; collection) {...} A possible but inadequate workaround would be to define a tool type for this: struct TraversalPair(Element) { uint index; Element element; } Then define the range's element output routines (opIndex, front, back) to return pairs instead of elements; and use this like: foreach (pair; collection) { actWith(pair.element, pair.index); } But this requires a client programmer to know this particularity; and anyway does not fit D common style and practice, I guess. How to solve this practically? I would be happy with the above workaround if it became a commonly used solution, supported by the stdlib and if necessary by the core language. This may scale by defining the tool type as a superclass instead, allowing various variants, possibly with more elements. 
Maybe an alternative is to use tuples: allow variants of opIndex, front, and back, to return (index, element) tuples and let the compiler use these overloads when the client code requests 2 (or more) returned values. The first solution is more explicit, and possibly general; the second may fit common practice better if/when tuples become widely used (a rather far & hypothetical future ;-) ?). Thank you, Denis I'd prefer the tuple solution, particularly as tuples + syntactic sugar have been discussed before regarding multiple value return.
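The tuple variant of the workaround can be sketched as a wrapper range that pairs each element with its index. The names here are illustrative, not a Phobos API:

```d
import std.typecons : tuple;

// Illustrative wrapper that turns any input range into a range of
// (index, element) tuples.
struct Indexed(R)
{
    R range;
    size_t index;

    @property bool empty() { return range.empty; }
    @property auto front() { return tuple(index, range.front); }
    void popFront() { range.popFront(); ++index; }
}

auto indexed(R)(R range) { return Indexed!R(range); }

void main()
{
    import std.range : iota;
    size_t sumIdx;
    int sumVal;
    foreach (pair; indexed(iota(10, 13)))  // elements 10, 11, 12
    {
        sumIdx += pair[0];   // the index
        sumVal += pair[1];   // the element
    }
    assert(sumIdx == 0 + 1 + 2);
    assert(sumVal == 10 + 11 + 12);
}
```

The compiler sugar spir asks for would let `foreach (i, e; ...)` destructure the tuple automatically instead of going through `pair[0]`/`pair[1]`.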
Re: property-like data members
On Sun, 02 Jan 2011 05:29:48 -0500, spir wrote: Hello, Using properties allows dressing up a method call as direct data access. What if the underlying member actually is plain data? Would it be possible to provide a real data member where the language expects a property (for instance as range empty & front properties)? Yes, see the Uniform access principle (http://en.wikipedia.org/wiki/Uniform_access_principle). (Though UAP hasn't been discussed much on the newsgroup)
Re: Dynamic D
On Mon, 03 Jan 2011 22:19:49 -0500, spir wrote: On Mon, 3 Jan 2011 22:23:29 +0000 (UTC) Adam Ruppe wrote: Over the weekend, I attacked opDispatch again and found some old Variant bugs were killed. I talked about that in the Who uses D thread. Today, I couldn't resist revisiting a dynamic kind of object, and made some decent progress on it. http://arsdnet.net/dcode/dynamic.d (You can compile that; there's a main() at the bottom of that file) It isn't quite done - still needs op overloading, and probably better errors, but it basically works. It works sort of like a Javascript object. Features: opDispatch and assignment functions: Dynamic obj; // assign from various types obj = 10; obj = "string"; obj.a = 10; // assign properties from simple types naturally // can set complex types with one compromise: the () after the // property tells it you want opAssign instead of property opDispatch obj.a() = { writefln("hello, world"); } // part two of the compromise - to call it with zero args, use call: obj.a.call(); // delegate with arguments works too obj.a() = delegate void(string a) { writeln(a); }; // Calling with arguments works normally obj.a("some arguments", 30); Those are just the basics. What about calling a D function? You need to convert them back to regular types: string mystring = obj.a.as!string; // as forwards to Variant.coerce // to emulate weak typing Basic types are great, but what about more advanced types? So far, I've implemented interfaces: interface Cool { void a(); } void takesACool(Cool thing) { thing.a(); } takesACool(obj.as!Cool); // it creates a temporary class implementing // the interface by forwarding all its methods to the dynamic obj I can make it work with structs too but haven't written that yet. I want to add some kind of Javascript like prototype inheritance too. I just thought it was getting kinda cool so I'd share it :) Waow, quite cool, indeed! I'll have a look at your code as soon as I can. 
Esp obj.a() = { writefln("hello, world"); } is a real mystery for me ;-) Well, obj.a() returns by ref, which is how that works. The expression itself is a zero arg delegate literal.
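The ref-return trick Adam describes can be stripped down to a few lines. This is a minimal sketch of the mechanism, not Adam's actual implementation:

```d
import std.variant : Variant;

// Minimal sketch: obj.a() hands back the underlying slot by reference,
// so the assignment writes straight into the stored Variant.
struct Dynamic
{
    Variant[string] members;

    ref Variant opDispatch(string name)()
    {
        if (name !in members)
            members[name] = Variant.init;
        return members[name];
    }
}

void main()
{
    Dynamic obj;
    obj.a = 10;           // property-style assignment via the ref return
    assert(obj.a == 10);
    obj.a() = "hello";    // the explicit () form from Adam's example
    assert(obj.a == "hello");
}
```

Because `opDispatch` returns `ref Variant`, the expression `obj.a()` is an lvalue, which is why a delegate literal can be assigned to it directly.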
Re: Dynamic D
On Mon, 03 Jan 2011 17:23:29 -0500, Adam Ruppe wrote: Over the weekend, I attacked opDispatch again and found some old Variant bugs were killed. I talked about that in the Who uses D thread. Today, I couldn't resist revisiting a dynamic kind of object, and made some decent progress on it. http://arsdnet.net/dcode/dynamic.d (You can compile that; there's a main() at the bottom of that file) It isn't quite done - still needs op overloading, and probably better errors, but it basically works. It works sort of like a Javascript object. [snip] I've been working on an update to both std.json and std.variant. Previews of both are available here: https://jshare.johnshopkins.edu/rjacque2/public_html/ though they are still works in progress. Two of the big enhancements that you might be interested in are call support and opDispatch + reflection + prototype structs. To paraphrase your example: Variant v; v.a( 10 ); assert(v.a == 10); v.a( { writefln("hello, world"); } ); v.a.call; //To be replaced by opCall, once struct opCall is fixed (Bug 4053) v.a( delegate void(string a, int x) { foreach(i;0..x) writeln(i+1," ",a); } ); v.a("potatoes", 3); I've also stubbed out a prototype style object, but I haven't really tested it yet. Thoughts, comments and use/test cases are always welcomed.
Re: Dynamic D
On Thu, 06 Jan 2011 10:35:07 -0500, Andrei Alexandrescu wrote: On 1/6/11 1:22 AM, Robert Jacques wrote: On Mon, 03 Jan 2011 17:23:29 -0500, Adam Ruppe wrote: Over the weekend, I attacked opDispatch again and found some old Variant bugs were killed. I talked about that in the Who uses D thread. Today, I couldn't resist revisiting a dynamic kind of object, and made some decent progress on it. http://arsdnet.net/dcode/dynamic.d (You can compile that; there's a main() at the bottom of that file) It isn't quite done - still needs op overloading, and probably better errors, but it basically works. It works sort of like a Javascript object. [snip] I've been working on an update to both std.json and std.variant. Previews of both are available here: https://jshare.johnshopkins.edu/rjacque2/public_html/ though they are still works in progress. Two of the big enhancements that you might be interested in are call support and opDispatch + reflection + prototype structs. To paraphrase your example: Variant v; v.a( 10 ); assert(v.a == 10); v.a( { writefln("hello, world"); } ); v.a.call; //To be replaced by opCall, once struct opCall is fixed (Bug 4053) v.a( delegate void(string a, int x) { foreach(i;0..x) writeln(i+1," ",a); } ); v.a("potatoes", 3); I've also stubbed out a prototype style object, but I haven't really tested it yet. Thoughts, comments and use/test cases are always welcomed. I think this transgresses the charter of Variant. Variant is meant to hold an object of some _preexisting_ type, not to morph into anything. We should have three abstractions: And Variant still only holds an object of some preexisting type. What you are seeing is simply syntactic sugar for a Variant of type Variant[string]. 
The above lowers down into: Variant v; Variant[string] __temp; v = __temp; v["a"] = 10; assert(v["a"] == 10); v["a"] = { writefln("hello, world"); }; v["a"].call(); v["a"] = delegate void(string a, int x) { foreach(i;0..x) writeln(i+1," ",a); }; v["a"].call("potatoes", 3); The only morph happens because actually making the Variant default type be Variant[string], has some issues (GC interaction, hasValue, Variant[string].init isn't usable, etc). So I decided that if and only if you used an uninitialized Variant as a Variant[string], it would 'morph' to a Variant[string]. As for the v.a -> v["a"] syntactic sugar, I have found it very useful in the parsing/use of dynamically structured structs, including JSON. * Algebraic holds any of a closed set of types. It should define method calls like a.fun(args) if and only if all of its possible types support the call with compatible arguments and return types. I have considered this, but while this concept looks good on paper, in practice it cripples Algebraic. The issue is that the intersections of types tend to have no methods/operators in common. For example, Algebraic!(int,string) would have no methods nor operators defined. * Variant holds any of an unbounded set of types. Reflection may allow us to define v.fun(args) to look up the method name dynamically and issue a runtime error if it doesn't exist (sort of what happens now with operators). It's not 'may' anymore. Reflection _does_ allow me to define v.fun(args) to look up the method name dynamically and issue a runtime error if it doesn't exist. eg: class Foo { real x = 5; } auto foo = new Foo; Variant a = foo; assert(a.x == 5); // perform runtime reflection on 'a' implicitly a.__reflect("x",Variant(10)); // or explicitly assert(a.__reflect("x") == 10); * Dynamic is a malleable type that you get to add state and methods to, just like in Javascript. And I have stubbed out a Prototype object for just this reason.
Re: Templates vs CTFE
On Thu, 06 Jan 2011 12:49:19 -0500, Max Samukha wrote: Some of us who have the knack of writing metaprograms in D know that many algorithms can be implemented with both recursive templates and CTFE. A simple example is map: Recursive template instantiation: template staticMap(alias pred, A...) { static if (A.length) alias TypeTuple!(pred!(A[0]), staticMap!(pred, A[1..$])) staticMap; else alias TypeTuple!() staticMap; } CTFE: template staticMap(alias pred, A...) { mixin("alias TypeTuple!(" ~ _staticMap(A.length) ~ ") staticMap;"); } private string _staticMap(size_t len) { string result; if (len) { result ~= "pred!(A[0])"; for (size_t i = 1; i < len; ++i) { result ~= ", pred!(A[" ~ to!string(i) ~ "])"; } } return result; } It is not easy to decide which approach to implement in a library because both have drawbacks. Can anybody give informed advice as to which one is preferable in practice? Or should a library provide both? CTFE is generally easier to write/understand and more generic than doing the same thing in templates. However, there are some serious flaws in how DMD currently handles CT strings (and arrays in general) which can cause extremely complex CTFE code to be incorrect, very slow to compile or crash DMD outright. For example, I initially implemented a compile-time reflection to runtime-time reflection system for an improved std.variant using CTFE, but had to switch to templates/conditional compilation instead. (see https://jshare.johnshopkins.edu/rjacque2/public_html/) See bug 1382 (http://d.puremagic.com/issues/show_bug.cgi?id=1382).
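For reference, a self-contained, runnable instance of the recursive-template variant, written in the modern eponymous-alias style; the pointer-making predicate is mine, purely for illustration:

```d
import std.meta : AliasSeq;

// The thread's TypeTuple lives in std.meta as AliasSeq nowadays;
// alias it back so the code reads like the post.
alias TypeTuple = AliasSeq;

// Recursive staticMap: the recursion must pass pred along, and an
// else branch terminates the recursion on the empty tuple.
template staticMap(alias pred, A...)
{
    static if (A.length)
        alias staticMap = TypeTuple!(pred!(A[0]), staticMap!(pred, A[1 .. $]));
    else
        alias staticMap = TypeTuple!();
}

// A simple predicate: T -> T*
template PointerTo(T) { alias PointerTo = T*; }

void main()
{
    alias Ptrs = staticMap!(PointerTo, int, double, char);
    static assert(is(Ptrs[0] == int*));
    static assert(is(Ptrs[1] == double*));
    static assert(is(Ptrs[2] == char*));
}
```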
Re: Dynamic D
On Thu, 06 Jan 2011 13:24:37 -0500, Andrei Alexandrescu wrote: On 1/6/11 11:52 AM, Robert Jacques wrote: And Variant still only holds an object of some preexisting type. What you are seeing is simply syntactic sugar for a Variant of type Variant[string]. The above lowers down into: Variant v; Variant[string] __temp; v = __temp; v["a"] = 10; assert(v["a"] == 10); v["a"] = { writefln("hello, world"); }; v["a"].call(); v["a"] = delegate void(string a, int x) { foreach(i;0..x) writeln(i+1," ",a); }; v["a"].call("potatoes", 3); The only morph happens because actually making the Variant default type be Variant[string], has some issues (GC interaction, hasValue, Variant[string].init isn't usable, etc). So I decided that if and only if you used an uninitialized Variant as a Variant[string], it would 'morph' to a Variant[string]. I think Variant should default to holding "void". It does, with the enhancement that assignment/initialization can occur by 'member' assignment. Personally, I think assigning to a void Variant should be as permissive as possible, but this behavior is trivial to remove if needs be. As for the v.a -> v["a"] syntactic sugar, I have found it very useful in the parsing/use of dynamically structured structs, including JSON. That's great, but (a) that's again outside what Variant is supposed to do, and (b) for JSON we're dealing with a closed hierarchy which suggests a different design. And JSON is supposed to be implemented _as_ a variant; it's one of the major use cases and feature litmus tests. Besides, Variant is supposed to "interfacing with scripting languages, and [allow] comfortable exploratory programming", and since this facilitates both those design goals of variant, I think this is within scope. * Algebraic holds any of a closed set of types. It should define method calls like a.fun(args) if and only if all of its possible types support the call with compatible arguments and return types. 
I have considered this, but while this concept looks good on paper, in practice it cripples Algebraic. The issue is that the intersections of types tend to have no methods/operators in common. For example, Algebraic!(int,string) would have no methods nor operators defined. Algebraic with the fundamental JSON types is a great example because they all may share certain methods. Except that they don't share _any_ methods. That's the point I was trying to make. Generally Algebraic with closed hierarchies (e.g. Visitor) are good candidates for the feature. Since you used 'closed hierarchies' to describe JSON, what exactly do you mean by it? 'closed hierarchies' to me implies an inheritance-like relationship between the types, which generally means that Algebraic is the wrong choice (i.e. why not use a common interface or super-type?) [snip] * Dynamic is a malleable type that you get to add state and methods to, just like in Javascript. And I have stubbed out a Prototype object for just this reason. Great. Why not call it "Dynamic"? Because it implements prototype based programming and not dynamic programming. (http://en.wikipedia.org/wiki/Prototype_based_programming) Andrei
Re: Immutable nested functions
On Fri, 07 Jan 2011 17:18:39 -0500, Don wrote: Tomek Sowiński wrote: A while ago I pointed out that the result of an immutably pure function (all arguments immutable, doesn't mutate globals) can be safely converted to immutable. More here: http://d.puremagic.com/issues/show_bug.cgi?id=5081 It helps with building complex immutable structures. Problem is, virtually every construction site is different so one is forced to define a new initializer function every time. To illustrate: void main() { immutable Node* init_leaf = ... ; uint breadth = ... ; immutable Node* tree = grow_tree(init_leaf, breadth); } Node* grow_tree(immutable Node* init_leaf, uint breadth) pure { Node* root = new Node; foreach (i; 0..breadth) { Node* leaf = new Node(init_leaf); leaf.parent = root; root.leaves ~= leaf; } return root; } I tried to find a way to create ad-hoc functions conveniently. Naturally, I turned to nested functions: void main() { immutable Node* init_leaf = ... ; uint breadth = ... ; Node* grow_tree() pure immutable { Node* root = new Node; foreach (i; 0..breadth) { Node* leaf = new Node(init_leaf); leaf.parent = root; root.leaves ~= leaf; } return root; } immutable Node* tree = grow_tree(); } Nested functions to be immutably pure must also guarantee that nothing gets mutated through its stack frame pointer. But there's a problem -- the compiler won't accept 'immutable' on a nested function. I think it should -- just like an immutable member function (e.g. in a class) is allowed to play only with immutable members of that class, an immutable nested function should be allowed to play only with the immutable members of the stack frame. It may seem a lesser change but I'm pretty excited as it solves the long-standing problems with immutable structure initialization. Excitement aside, I got questions: 1. I'm proposing to proliferate the concept of immutability from 'this' reference to a stack frame pointer. 
Although I'm confident it makes sense as these are interchangeable in delegates, I could use some criticism to either reject or strengthen the idea. 2. What about delegates? Should there be means to express a "delegate that doesn't mutate through its 'this'/stack frame pointer"? What should the syntax be for defining such delegate type and for lambdas (delegate literals)? 3. (vaguely related) Should there be means to express annotated delegates in general (e.g. pure, nothrow). There is. int delegate(int) pure square = cast( int delegate(int z) pure ) (int z) { return z*z; }; What about annotated lambdas? See the line above -- at the moment it requires a cast. Yuck yuck yuck. The other option would be for the compiler to automatically determine purity. I believe it always has access to the source code of the lambda, so there should be no problem to determine purity. Umm... int delegate(int) pure square = delegate(int z) pure { return z*z; }; compiles and runs fine. What doesn't compile is (int z) pure { return z*z; }; or anything similar.
Re: -0 assigned to a FP variable
On Sun, 09 Jan 2011 20:39:12 -0500, Jonathan M Davis wrote: On Sunday 09 January 2011 16:27:11 bearophile wrote: A bug I've introduced once in my D code, are you able to spot it? void main() { double x = +0; double y = -0; } The bug: 'y' isn't the desired double -0.0 To avoid this bug DMD may keep the -0 and +0 integer literals represented in two distinct ways, so they have two different values when/if assigned to a floating point. (But here D will behave a little differently from C). An alternative is to just add a warning to DMD. A third possibility is to just ignore this probably uncommon bug :-) Bye, bearophile I didn't even know that there _was_ such a thing as + or - 0 (or 0.0). I would have considered it a no-op, being identical to 0 or 0.0, and expected it to be compiled out completely. I haven't a clue what -0.0 would even mean. But I'm not exactly an expert on floating point values, so presumably, there's some weird floating point thing that - affects. -0.0 is an artifact of the floating point sign bit: i.e. there is a + and - for each value, so naturally there's also + and - 0. The difference isn't generally something you care about, as they are practically identical except in bit pattern (i.e. assert( 0.0 == -0.0 )).
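The few places where the sign of zero is observable can be checked directly; signbit and division are the classic witnesses:

```d
import std.math : signbit;

void main()
{
    double pz = 0.0;
    double nz = -0.0;

    // Comparison cannot tell the two zeros apart...
    assert(pz == nz);

    // ...but the bit pattern differs, and a few operations expose it:
    assert(!signbit(pz));
    assert(signbit(nz));
    assert(1.0 / pz == double.infinity);
    assert(1.0 / nz == -double.infinity);

    // bearophile's pitfall: the integer literal -0 is just 0,
    // so the sign is lost before the conversion to double happens.
    double y = -0;   // y is +0.0, not -0.0
    assert(!signbit(y));
}
```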
Re: Constructors (starstruck noob from C++)
On Thu, 20 Jan 2011 22:02:42 -0500, Andrei Alexandrescu wrote: On 1/20/11 7:18 PM, Luke J. West wrote: Hi to all from a total noob. first of all, I'd like to say how impressed I am with D. In fact, I keep pinching myself. Have I *really* found a language worth leaving C++ for after two decades? It's beginning to look that way. Obviously I'm devouring the 2.0 documentation right now, but have not yet found out how to create a new instance of an existing class object. What I mean is, given... auto a = new A; how do I, in c++ speak, do the D for... A b(a); // or if you prefer... A* b = new A(a); I'm sure this must be trivial. Many many thanks, Luke Hopefully this won't mark a demise of your newfound interest :o). If A were a struct, auto b = new A(*a) would do. For classes, D does not provide automatic copy constructors; you need to define and follow a sort of cloning protocol. That being said, it's not difficult to define a generic function that copies fields over from one class object to another. Here's a start: import std.stdio; void copyMembers(A)(A src, A tgt) if (is(A == class)) { foreach (e; __traits(allMembers, A)) { static if (!is(typeof(__traits(getMember, src, e)) == function) && e != "Monitor") { __traits(getMember, tgt, e) = __traits(getMember, src, e); } } } class A { int x = 42; string y = "hello"; final void fun1() {} void fun2() {} static void fun3(){} } void main() { auto a = new A; a.x = 43; auto b = new A; copyMembers(a, b); assert(b.x == 43); } I think copyMembers belongs to the standard library. I wanted to define a family of functions like it but never got around to it. Andrei First, why not use tupleof? b.tupleof = a.tupleof; works perfectly fine, simpler and ahem, actually works. __traits(getMember, ...) has to obey scoping rules, so using it with a class that defines private variables results in a message like class hello.A member x is not accessible. 
Furthermore, you need to filter allMembers by a lot more than just function and "Monitor" as it also includes enum constants, etc. Having tried using it for serialization, I know it's non-trivial to use correctly, if you only want the actual data fields. i.e. void copyMembers(A)(A src, A tgt) if (is(A == class)) { tgt.tupleof = src.tupleof; }
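A runnable sketch of the tupleof-based copy, including a private field of the kind that trips up the __traits(getMember) version (here the test lives in the same module, but tupleof also bypasses visibility across modules):

```d
// Field-by-field copy via tupleof: only data fields participate,
// so methods and enum constants need no filtering.
void copyMembers(A)(A src, A tgt) if (is(A == class))
{
    tgt.tupleof = src.tupleof;
}

class A
{
    private int x = 42;   // private is fine: tupleof ignores visibility
    string y = "hello";
    void fun() {}          // methods are skipped automatically
}

void main()
{
    auto a = new A;
    a.x = 43;
    auto b = new A;
    copyMembers(a, b);
    assert(b.x == 43 && b.y == "hello");
}
```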
Re: Constructors (starstruck noob from C++)
On Fri, 21 Jan 2011 08:16:24 -0500, spir wrote: On 01/21/2011 06:28 AM, Robert Jacques wrote: void copyMembers(A)(A src, A tgt) if (is(A == class)) { tgt.tupleof = src.tupleof; } What about this feature in Object under the name "copy" or "dup"? Sure, it's not to be used everyday; but it's typically the kind of routine that, when needed, we're very happy to find. And as shown by this thread the solution is clearly non-obvious (lol). By the way, why "dup" in D, instead of the more common "copy" or "clone"? Is it also a legacy name? (Don't tell me we got this one from stack-based languages like Forth ;-) Anyway the semantics are totally different (*)). Denis (*) for very curious people: concatenative languages: http://concatenative.org/wiki/view/Concatenative%20language _ vita es estrany spir.wikidot.com ".dup" comes from arrays, which already have a ".dup" property which copies/clones them.
Re: Possible bug in std.algorithm.map
On Sat, 29 Jan 2011 11:12:28 -0500, Magnus Lie Hetland wrote: Hi! Just read Andrei Alexandrescu's new book, and I'm starting to experiment with using D in my algorithms research. Loved the book, and I'm loving the language so far :D I just hit a snag, though ... I was doing something simple, for which my prototype code (in Python) was d, u = max((D(u,v), v) for v in V) I first started writing it explicitly with loops, but it got a bit too verbose for my taste. Thought I'd use map and reduce, perhaps (although I'm still not sure if that's practical, as I'm reducing with max, but I'd like the argmax as well...). Anyway -- while attempting to use map, I suddenly got a segfault. As I hadn't really done any crazy stuff with pointers, or circumvented the bounds checks or the like, I was a bit surprised. I have now boiled things down to the following little program: import std.algorithm; void f() { auto x = 0; double g(int z) { // Alt. 1: return int auto y = x; // Alt. 2: remove this return 0; } auto seq = [1, 2, 3]; auto res = map!(g)(seq); } void main() { f(); } When I compile and run this (dmd 2.051, OS X 10.5.8), I get a segmentation fault. Oddly enough, if I *either* change the return type to int *or* remove the "y = x" line, things work just fine. Am I correct in assuming this is a bug? Yes, it's Issue 5073 (http://d.puremagic.com/issues/show_bug.cgi?id=5073). I've tested your test case using the listed patch + DMD 2.051 and it works. The issue used to be a bad compile-time error in earlier compiler versions, but in DMD 2.051 it turned into a runtime error. The underlying error in DMD has to do with an alias of a delegate having a bad hidden pointer. The reason commenting out 'auto y = x;' works is that the incorrect hidden pointer is never used, thus never causing a segfault (IIRC). Issue 5073's patch works by passing delegates by value instead of by alias.
Looking over Issue 5064, Don is probably right in it being the root cause in DMD, but if you just want map to work correctly, you might want to try the patch from 5073.
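For what it's worth, the original max/argmax one-liner can be expressed with reduce alone, sidestepping map entirely. A sketch, where dist is a stand-in for the poster's D(u,v):

```d
import std.algorithm;
import std.typecons;

double dist(int u, int v) { return (u - v) * (u - v); } // stand-in for D(u,v)

void main()
{
    int u = 7;
    auto V = [1, 5, 9, 12];
    // Carry the running (max distance, argmax vertex) pair through the fold,
    // mirroring Python's max((D(u,v), v) for v in V).
    auto best = reduce!((a, v) => dist(u, v) > a[0] ? tuple(dist(u, v), v) : a)
                       (tuple(-double.infinity, V[0]), V);
    assert(best[1] == 1); // v == 1 is farthest from u == 7
}
```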
Re: review for unittests
On Sun, 30 Jan 2011 02:45:24 -0500, Andrei Alexandrescu wrote: I understand. I hope you also understand that your argument has only subjective basis, with you as the subject. You are literally only the second or third fellow coder to ever tell me such. Well, for your statistics, let me be the fourth. :) But honestly, this is mostly due to the fact I'm generally using a high-res monitor which doesn't rotate, so my editor is about 150 chars wide but vertically narrow (and that I will routinely break my own style, if it makes the code more comprehensible/readable to me). Of course, one of the nice things about a 150-char editor is the 'two column' layout of 'code here; // Comment/doc over here'.
Re: Variants array and IDK
On Mon, 31 Jan 2011 02:04:11 -0500, g g wrote: IDK where to put this. First thing: could there be a way to collapse Variants in std.variant? Like Variant x = Variant(1) Variant y = Variant([x,x,x,x]) //y could be simplified to Variant(int[]) instead of Variant(Variant(int)[]) //now it is not implemented (as far as I know) Also I want to know if using Variant internally for a scripting language is recommendable (or acceptable). I am facing the problem writing my weird language ( https://github.com/zkp0s/Dw feel free to comment on it __not finished__) Excuse me if I wrote in the wrong newsgroup. I didn't know where to put it, .learn, .D or .announce (Dw). Thanks for reading. One of Variant's design goals is to be used for interfacing/implementing scripting languages. So, yes, I'd say it's both recommended and acceptable. In fact, I'd encourage you to post to the newsgroup or bugzilla any rough edges you run into with variant. I've been working on an update to it which fixes all current bugzilla issues and implements a runtime-reflection system. (https://jshare.johnshopkins.edu/rjacque2/public_html/variant.mht, https://jshare.johnshopkins.edu/rjacque2/public_html/variant.d)
Re: C# Interop
On Mon, 31 Jan 2011 16:25:11 -0500, Eelco Hoogendoorn wrote: Hi all, At work I currently develop in C++ with a C# front-end, using CLI interop. I've been using D for quite a while for hobby projects, and needless to say, it makes C++ feel like masochism. I'd like to convince my coworkers that adding D to the mix is more than just another complication. But I'm not quite sure what would be the best way to do C# and D interop. CLI will no longer do me any good, I fear. Do I just create a D DLL with a bunch of free extern(C) functions for communication? Basically, yes. I've been doing this for a project and it works reasonably well. I have one module, based on D's/bugzilla's public domain example code, to properly enable the DLL, and then I declare the interop functions inside export extern(C) {}. There are also some general helper functions you'll want to write. Although exceptions can propagate to .NET, they just turn into a System.Runtime.InteropServices.SEHException exception, so you'll want to wrap all your export functions in a try-catch block and save the error message to a global variable. Then you can call a lastError function to get the actual error string. Oh, and remember .NET defaults to wstrings, not strings. The other helper function you'll want is a way to pin and unpin objects, as D's GC can't see C#'s memory. What about marshalling? C# marshaling, though I'm glad it's there, involves pulling teeth to do anything other than calling basic C system calls. Luckily, you really only need it for arrays and structs. Objects can be treated as handles inside a proxy C# object. And then you can handle methods as free functions whose first argument is the object's handle. But you'll also have to write a C# proxy object + pin/unpin the D object. If you're doing a lot of this, I'd recommend writing a mixin to generate all the free functions on the D side, and looking into the dynamic language features of C# to write an auto-wrapping proxy object.
(I haven't needed to do this yet) Is using unsafe pointers back and forth the best I can do? C# can read from a pointer allocated by D in unsafe mode, and D can read from a pinned C# pointer, right? I don't know if it's the best you can do, but it does work. (No, I'm not looking for D.NET; what I miss in C# is to-the-metal / compilation.) Lastly, D DLLs will only work on Vista/Windows 7/later. They will not work on XP. This is due to a long-known bug with DLLs and thread local storage in general on XP. Also, you'll have to use 32-bit C# currently, as DMD isn't 64-bit compatible yet. (Walter is hard at work on a 64-bit version of DMD, but it will be Linux only at first, with Windows following sometime later) I've listed some example code from my project below: // Written in the D Programming Language (www.digitalmars.com/d) ///Basic DLL setup and teardown code. From D's/bugzilla's public domain example code. module dll; import std.c.windows.windows; import std.c.stdlib; import core.runtime; import core.memory; extern (Windows) BOOL DllMain(HINSTANCE hInstance, ULONG ulReason, LPVOID pvReserved) { switch (ulReason) { case DLL_PROCESS_ATTACH: Runtime.initialize(); break; case DLL_PROCESS_DETACH: Runtime.terminate(); break; case DLL_THREAD_ATTACH: case DLL_THREAD_DETACH: return false; } return true; } D code: private { wstring last_error_msg = "No error";/// The last error encountered by the dose server __gshared Object[Object] pinned;/// Hash of pinned objects void pin(Object value) { pinned[ value ] = value; } /// Pin an object /// Stores a string as the last error void lastError(string str) { wstring err; auto app = appender(&err); foreach (dchar c; str) app.put(c); last_error_msg = err; enforce(false); } } export extern(C) { ref wstring lastError() { return last_error_msg; } /// returns: the last error message, used for exception marshaling } C# code: [DllImport(gpu_dll)] [return: MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(DString))] private static
extern string lastError(); /// /// A custom marshaller for D's strings (char[]) passed by ref /// private class DString : ICustomMarshaler { //[ThreadStatic] //private static ICustomMarshaler _instance = new DString(); private IntPtr ptr = IntPtr.Zero; /// /// Factory method /// /// /// static ICustomMarshaler GetInstance(String pstrCookie) { return new DString(); } /// /// Convert a pointer to a D array to a managed T[] ///
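The pin/unpin helper pair mentioned earlier in this thread could be rounded out like this (a sketch consistent with the pin function in the example code above; the unpin name is an assumption):

```d
__gshared Object[Object] pinned; /// Roots keeping D objects alive for C# callers

export extern(C)
{
    /// Pin an object so the GC won't collect it while C# holds its handle.
    void pin(Object value)   { pinned[value] = value; }

    /// Release the pin once the C# proxy is disposed.
    void unpin(Object value) { pinned.remove(value); }
}
```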
Re: Variants array and IDK
On Mon, 31 Jan 2011 02:04:11 -0500, g g wrote: IDK where to put this. First thing: could there be a way to collapse Variants in std.variant? Like Variant x = Variant(1) Variant y = Variant([x,x,x,x]) //y could be simplified to Variant(int[]) instead of Variant(Variant(int)[]) //now it is not implemented (as far as I know) // Although this has to be true assert( y.type == typeid(Variant[]) ); // Should coerce support such lowering/conversions/collapses? i.e. int[] z = y.coerce!(int[])();
Re: C# Interop
On Mon, 31 Jan 2011 22:32:35 -0500, Andrej Mitrovic wrote: On 2/1/11, Robert Jacques wrote: Lastly, D DLLs will only work on Vista/Windows 7/later. They will not work on XP. This is due to a long-known bug with DLLs and thread local storage in general on XP. Is there a bugzilla link for this? This isn't a D issue. It's a well documented problem with thread local storage with all DLLs on Windows XP. I haven't seen a specific bugzilla entry on this issue, but there are several issues with writing/loading DLLs in D, which I assume will be addressed at/about the time Linux .so's are, which are listed as next/in progress after 64-bit support.
Re: C# Interop
On Tue, 01 Feb 2011 03:05:13 -0500, Rainer Schuetze wrote: Robert Jacques wrote: On Mon, 31 Jan 2011 16:25:11 -0500, Eelco Hoogendoorn wrote: [...] Lastly, D DLLs will only work on Vista/Windows 7/later. They will not work on XP. This is due to a long known bug with DLLs and thread local storage in general on XP. Also, you'll have to use 32-bit C# currently, as DMD isn't 64-bit compatible yet. (Walter is hard at work on a 64-bit version of DMD, but it will be Linux only at first, with Windows following sometime later) XP TLS support with dynamically loaded DLLs is fixed for some time now with a workaround implemented in druntime. Also, DLLs can be used in multi-threading environments. Yes, I pointed out in another thread that D loading D DLLs can work around this issue, but the original post was about calling a D DLL from another language, specifically C#, where the limitation in XP still exists. (Of course, you might be able to port the work around to C#. Hmm...) > I've listed some example code from my project below: [snip] This DLLMain code is a bit outdated (is it D1?), the current proposed version is here: http://www.digitalmars.com/d/2.0/dll.html Thanks. It was D2, but it was forked a while ago. Given that the recommended way of doing this might change in the future, a string mixin in core.dll_helper might be appropriate.
Re: C# Interop
On Tue, 01 Feb 2011 13:33:20 -0500, Rainer Schuetze wrote: Robert Jacques wrote: On Tue, 01 Feb 2011 03:05:13 -0500, Rainer Schuetze wrote: XP TLS support with dynamically loaded DLLs has been fixed for some time now with a workaround implemented in druntime. Also, DLLs can be used in multi-threading environments. Yes, I pointed out in another thread that D loading D DLLs can work around this issue, but the original post was about calling a D DLL from another language, specifically C#, where the limitation in XP still exists. (Of course, you might be able to port the workaround to C#. Hmm...) The workaround is not about D loading a D DLL. Visual D lives happily in the C++/C# world of Visual Studio, even on XP. It's the magic inside dll_process_attach() that sets up TLS for existing threads and patches the loader structures to make XP think the DLL was loaded at startup (where implicit TLS works). The downside is that the DLL cannot be unloaded, though. Thanks, again. Though the pros and cons of this should be listed in the docs somewhere. > I've listed some example code from my project below: [snip] This DLLMain code is a bit outdated (is it D1?), the current proposed version is here: http://www.digitalmars.com/d/2.0/dll.html Thanks. It was D2, but it was forked a while ago. Given that the recommended way of doing this might change in the future, a string mixin in core.dll_helper might be appropriate. I don't like mixins too much, but a standard_DllMain that you can forward to from DllMain might be a good idea to include in the runtime library. Yes, on second thought, a standard_DllMain is the better solution.
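The forwarding approach being agreed on might look like this (a sketch; standard_DllMain is the hypothetical runtime helper under discussion, not an existing druntime symbol):

```d
import std.c.windows.windows;

// Hypothetical helper provided by the runtime, wrapping the
// initialization/termination and XP TLS patching logic.
extern (Windows) BOOL standard_DllMain(HINSTANCE h, ULONG reason, LPVOID reserved);

extern (Windows) BOOL DllMain(HINSTANCE hInstance, ULONG ulReason, LPVOID pvReserved)
{
    // DLL-specific setup could go here; everything else is forwarded.
    return standard_DllMain(hInstance, ulReason, pvReserved);
}
```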
Re: Calling method by name.
On Wed, 02 Feb 2011 12:55:37 -0500, %u wrote: I know it is possible to create an object from its name. Is it possible to call a method on that object if the name is only known at runtime? Would something like the following be possible? string classname, methodname; // Ask the user for class and method. auto obj = Object.factory(classname); invoke(methodname, obj, param1, param2); Thanks I've been working on an update to std.variant, which includes a bridge from compile-time reflection to a runtime-reflection system. (See https://jshare.johnshopkins.edu/rjacque2/public_html/) From the docs: Manually registers a class with Variant's runtime-reflection system. Note that Variant automatically registers any types it is exposed to. Note how in the example below, only Student is manually registered; Grade is automatically registered by Variant via compile-time reflection of Student. module example; class Grade { real mark; } class Student { Grade grade; } void main(string[] args) { Variant.__register!Student; Variant grade = Object.factory("example.Grade"); grade.mark(96.6); assert(grade.mark == 96.6); } And dynamic method/field calls are handled via the __reflect(string name, Variant[] args...) method like so: grade.__reflect("mark",Variant(96.6)); assert(grade.__reflect("mark") == 96.6);
Re: Calling method by name.
On Thu, 03 Feb 2011 08:49:54 -0500, Jacob Carlborg wrote: On 2011-02-03 05:52, Robert Jacques wrote: On Wed, 02 Feb 2011 12:55:37 -0500, %u wrote: I know it is possible to create an object from its name. Is it possible to call a method on that object if the name is only known at runtime? Would something like the following be possible? string classname, methodname; // Ask the user for class and method. auto obj = Object.factory(classname); invoke(methodname, obj, param1, param2); Thanks I've been working on an update to std.variant, which includes a bridge from compile-time reflection to a runtime-reflection system. (See https://jshare.johnshopkins.edu/rjacque2/public_html/) From the docs: Manually registers a class with Variant's runtime-reflection system. Note that Variant automatically registers any types it is exposed to. Note how in the example below, only Student is manually registered; Grade is automatically registered by Variant via compile-time reflection of Student. module example; class Grade { real mark; } class Student { Grade grade; } void main(string[] args) { Variant.__register!Student; Variant grade = Object.factory("example.Grade"); grade.mark(96.6); assert(grade.mark == 96.6); } And dynamic method/field calls are handled via the __reflect(string name, Variant[] args...) method like so: grade.__reflect("mark",Variant(96.6)); assert(grade.__reflect("mark") == 96.6); Why would you need to pass in Variants in __reflect? Why not just make it a variadic method and automatically convert to Variant? Well, opDispatch does exactly that. __reflect, on the other hand, was designed as a quasi-backend function primarily for a) internal use (hence the double underscore), b) scripting language interfacing/implementing and c) user-extension. So efficiency was of key importance. And the reflection system is extensible, as Variant knows to call __reflect on user-defined types. This makes things like prototype-style objects possible.
(There's even a beta implementation of a prototype object in the library) But this requires that user-supplied __reflect methods not be templated. I'm not well versed in dynamic reflection and its use cases, so when I considered the combination of a runtime method name and compile-time argument type information, I classed it as 'rare in practice'. But if that's not the case, I'd like to know and would greatly appreciate a use case/unit test.
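For reference, the opDispatch-to-__reflect forwarding described above can be sketched as follows (assuming the proposed __reflect API from this thread; the Dynamic struct itself is illustrative):

```d
import std.variant;

struct Dynamic
{
    // Backend entry point of the proposed reflection API.
    Variant __reflect(string name, Variant[] args...)
    {
        // ... dispatch on name at runtime ...
        return Variant.init;
    }

    // Converts obj.foo(1, "a") into obj.__reflect("foo", Variant(1), Variant("a")),
    // i.e. the automatic variadic-to-Variant conversion asked about above.
    Variant opDispatch(string name, Args...)(Args args)
    {
        Variant[] vargs;
        foreach (arg; args)
            vargs ~= Variant(arg);
        return __reflect(name, vargs);
    }
}
```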
Re: Uniform call syntax for implicit this.
On Thu, 03 Feb 2011 13:42:30 -0500, Jonathan M Davis wrote: On Thursday, February 03, 2011 09:54:44 Michel Fortin wrote: On 2011-02-03 12:43:12 -0500, Daniel Gibson said: > Am 03.02.2011 15:57, schrieb Michel Fortin: >> On 2011-02-02 23:48:15 -0500, %u said: >>> When implemented, will uniform call syntax work for the "this" >>> object even if not specified? >>> >>> For example, will foo() get called in the following example? >>> >>> void foo(A a, int b) {} >>> >>> class A { >>> void test() { >>> this.foo(10); >>> foo(10); >>> } >>> } >> >> I think it should work. > I think foo(10) should *not* be equivalent to foo(this, 10). Personally, I'm not sure whether the uniform call syntax will be all that useful or not, but if it gets implemented I think foo(10) should be equivalent to foo(this, 10) in the case above. That said, it should not be ambiguous: if there is a member function foo and a global function foo and both match the call, it's ambiguous and it should be an error. Can this work in practice? We probably won't know until we have an implementation to play with. Except that if you have both a member function foo and a free function foo, how can you tell the compiler which to use? *sigh* The same way you do it today, but using the outer-scope operator '.'. So foo(10) would call the member function, while .foo(10) would call the outer function, which would be re-written as .foo(this,10). An example of this today in D: import std.conv; class Foo { string text; string toString() {return .text("My text is: ", text);} }
Re: Why does std.variant not have a tag?
On Sunday, 4 November 2012 at 22:33:46 UTC, Alex Rønne Petersen wrote: On 05-11-2012 00:31, evansl wrote: http://dlang.org/phobos/std_variant.html says: This module implements a discriminated union type (a.k.a. tagged union, algebraic type). Yet, the wiki page: http://en.wikipedia.org/wiki/Tagged_union says: a tag field explicitly indicates which one is in use. and I don't see any indication of a tag field in the std_variant.html page. Another wiki reference: http://en.wikipedia.org/wiki/Disjoint_union is more explicit because it pairs the tag with the value: (x,i) where x is the value and i is the tag. One reason for an explicit tag is that the bounded types may contain the same type twice. This has lead to problems in boost::variant as evidenced by the post: http://article.gmane.org/gmane.comp.parsers.spirit.general/17118 In addition, both variant and tuple have a common part, a metafunction mapping from a tag to a type; hence, this same common part could be used to implement both tuple and a tagged variant. A variant which actually contained a tag field I think would be more general in that it would allow duplicate types among the bounded types just as a tuple's bounded types can contain duplicate types. -regards, Larry Yes, this is a big problem with the current std.variant implementation (among other problems such as no recursive variants). The best std.variant can offer right now is the 'type' property to identify what's stored in it. std.variant is, unfortunately, not very useful if you want the semantics of variants in ML-style languages. I've been working on an update to std.variant whose formal submission has been held up by a PhD thesis and some health issues, although I'm currently (slowly) doing a code review/cleanup of it in the hope of finally submitting it. ( Old code: https://jshare.johnshopkins.edu/rjacque2/public_html/ ) Anyways, my implementation has an internal 'tag' as does the current implementation, IIRC. 
However, as the tag is a meaningless random number, .type instead provides access to the meaningful TypeInfo object of the stored type. And .type provides most (all?) of the functionality of an int id: auto var = Variant(5); if(var.type == typeid(int)) { // Do something... } else if(var.type == typeid(string)) { // Do something else... } But I may be missing something, as I found that the linked post wasn't clear on what the exact issue was, only that there was an issue. If someone would like to clarify the problem (or any other issue with Variant) it would be appreciated.
Re: Why does std.variant not have a tag?
On Monday, 5 November 2012 at 14:13:41 UTC, evansl wrote: On 11/05/12 00:33, Robert Jacques wrote: On Sunday, 4 November 2012 at 22:33:46 UTC, Alex Rønne Petersen wrote: On 05-11-2012 00:31, evansl wrote: [snip] If std.Algebraic is like Boost.Variant, then duplicate bounded types are not allowed, which leads to the problem mentioned in the post on the spirit mailing list which I linked to in my OP. OOPS, now I see why reading that post was not clear enough. Maybe this earlier post in the same spirit thread would be clearer: http://article.gmane.org/gmane.comp.parsers.spirit.general/17113 In particular, note the phrase: neither can you use variant because variant can't take duplicate types. This can lead to problems in the spirit parser because the attribute of parsing: a | b where: phrase a has an attribute of type A phrase b has an attribute of type B is: variant as noted near the bottom of: http://www.boost.org/doc/libs/1_51_0/libs/spirit/doc/html/spirit/abstracts/attributes/compound_attributes.html and if A and B are the same, then there's a problem because Boost.variant can't handle duplicates. Hope that's clearer. -regards, Larry Thank you for the clarification. Implementing an id seems a reasonable feature request for Algebraic. I've added a bugzilla request for it: http://d.puremagic.com/issues/show_bug.cgi?id=8962 Please have a look in case I missed anything. BTW, recently there was a review of another Boost library that has some similarity to Boost.any. It's called type_erasure: http://steven_watanabe.users.sourceforge.net/type_erasure/libs/type_erasure/doc/html/index.html Since std.Variant is similar to Boost.any (as noted above), and since Boost.any is, in some ways, like Boost.type_erasure, and since you're working on a revised std.Variant, you might be interested in looking at type_erasure. Thanks, I'll take a look at it.
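To make the duplicate-type limitation concrete: today the only discriminator is .type, so two alternatives of the same static type cannot be told apart; the requested id would work like a tuple index (the Tagged name below is hypothetical, not a proposed API):

```d
import std.variant;

void main()
{
    // Today: the TypeInfo returned by .type is the only tag available.
    alias Algebraic!(int, string) V;
    V v = 5;
    assert(v.type == typeid(int));

    // Hypothetical: a Tagged!(int, int) would carry an index (0 or 1),
    // keeping "first int" and "second int" distinct, just as
    // Tuple!(int, int) keeps its two int fields distinct.
}
```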
Re: Immutable and unique in C#
On Fri, 09 Nov 2012 07:53:27 -0600, Sönke Ludwig wrote: Just stumbled over this, which is describing a type system extension for C# for race-free parallelism: http://research.microsoft.com/pubs/170528/msr-tr-2012-79.pdf Independent of this article I think D is currently missing out a lot by omitting a proper unique type (a _proper_ library solution would be a start, but I'm not sure if that can handle all details). It would make a lot of the cases work that are currently simply not practical because of loads of casts that are necessary. What's wrong with std.typecons.Unique? By the way, back when concurrency in D was actively being discussed and developed, (IIRC) Walter did try to implement unique as a proper type in D, but ran into several gotchas. In essence, while we all want unique/mobile, for unique/mobile to be non-broken it also needs at least a lent/scope and an owned type. Ownership type systems are a relatively new area of CS and as demonstrated by the paper, still an active area of research. The thorny issue of these systems is that you need to associate variables with types. For example, this.x = that.y; is only valid if x and y are both owned by the same region. This is easy to verify at runtime, but not at compile-time. Anyways, ownership types (or whatever supersedes them) were pushed back to D3 and we were left with message passing for the common user, synchronized for traditional lock-based shared memory and shared for lock-free programming. P.S. Thanks for the link to the paper.
Re: half datatype?
On Sun, 18 Nov 2012 05:21:27 -0600, Manu wrote: I've often wondered about having an official 'half' type. It's very common in rendering/image processing, supported by most video cards (so compression routines interacting with this type are common), and it's also supported in hardware by some cpu's. ARM for instance supports 'half's in hardware, and GCC has an __fp16 type which would map nicely if D supported the type in the front end. The alternative is to use ushort everywhere, which is awkward, because the value is neither unsigned, nor is it an integer, and it's not typesafe (allows direct assignment to ints and stuff)... It would be nice if: cast(half)someFloat would yield the proper value, even if it is performed in software on most architectures; it could be mapped to hardware for those that support it. It could be done in a library, but then GCC couldn't map it properly to the hardware type, and since D has no way to describe implicit casts (that I know of?) it becomes awkward to use. someFloat = someHalf <- doesn't work, because a cast operator expects an explicit cast, even though this is a lossless conversion and should be exactly the same as someDouble = someFloat. Thoughts? Vote--. A half data type is already part of std.numeric.
From the docs: // Define a 16-bit floating point value CustomFloat!16 x; // Using the number of bits CustomFloat!(10, 5) y; // Using the precision and exponent width CustomFloat!(10, 5, CustomFloatFlags.ieee) z; // Using the precision, exponent width and format flags CustomFloat!(10, 5, CustomFloatFlags.ieee, 15) w; // Using the precision, exponent width, format flags and exponent offset bias // Use the 16-bit floats mostly like normal numbers w = x*y - 1; writeln(w); // Function calls require conversion z = sin(+x) + cos(+y); // Use unary plus to concisely convert to a real z = sin(x.re) + cos(y.re); // Or use the .re property to convert to a real z = sin(x.get!float) + cos(y.get!float); // Or use get!T z = sin(cast(float)x) + cos(cast(float)y); // Or use cast(T) to explicitly convert // Define an 8-bit custom float for storing probabilities alias CustomFloat!(4, 4, CustomFloatFlags.ieee^CustomFloatFlags.probability^CustomFloatFlags.signed) Probability; auto p = Probability(0.5);
Re: Array Operations: a[] + b[] etc.
On Thu, 22 Nov 2012 06:10:04 -0600, John Colvin wrote: On Wednesday, 21 November 2012 at 19:40:25 UTC, Mike Wey wrote: If you want to use this syntax with images, DMagick's ImageView might be interesting: http://dmagick.mikewey.eu/docs/ImageView.html I like it :) From what I can see it provides exactly what I'm talking about for 2D. I haven't looked at the implementation in detail, but do you think that such an approach could be scaled up to arbitrary N-dimensional arrays? Yes and no. Basically, like an array, an ImageView is a thick pointer, and as the dimensions increase the pointer gets thicker by 1-2 words per dimension. And each indexing or slicing operation has to create a temporary with this framework, which leads to stack churn as the dimensions get large. Another syntax that can be used until we get true multi-dimensional slicing is to use opIndex with int[2] arguments, i.e.: view[[4,40],[5,50]] = new Color("red");
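A minimal sketch of the int[2] indexing workaround for a 2D case (the View struct and its layout are illustrative, not DMagick's actual implementation):

```d
struct View
{
    int[] data;
    size_t width;

    // view[[r0, r1], [c0, c1]] = value assigns to the sub-block
    // spanning rows r0..r1 and columns c0..c1.
    void opIndexAssign(int value, int[2] rows, int[2] cols)
    {
        foreach (r; rows[0] .. rows[1])
            data[r * width + cols[0] .. r * width + cols[1]] = value;
    }
}

void main()
{
    auto v = View(new int[100], 10);
    v[[4, 6], [5, 8]] = 1; // rows 4-5, columns 5-7
    assert(v.data[45] == 1 && v.data[44] == 0);
}
```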
Re: Array Operations: a[] + b[] etc.
On Thu, 22 Nov 2012 20:06:44 -0600, John Colvin wrote: On Thursday, 22 November 2012 at 21:37:19 UTC, Dmitry Olshansky wrote: Array ops are supposed to be overhead-free loops transparently leveraging SIMD parallelism of modern CPUs. No more and no less. It's like auto-vectorization but it's guaranteed and obvious in the form. I disagree that array ops are only for speed. I would argue that their primary significance lies in their ability to make code significantly more readable, and more importantly, writeable. For example, the vector distance between 2 position vectors can be written as: dv[] = v2[] - v1[] or dv = v2[] - v1[] anyone with an understanding of mathematical vectors instantly understands the general intent of the code. With documentation something vaguely like this: "An array is a reference to a chunk of memory that contains a list of data, all of the same type. v[] means the set of elements in the array, while v on its own refers to just the reference. Operations on sets of elements e.g. dv[] = v2[] - v1[] work element-wise along the arrays {insert mathematical notation and picture of 3 arrays as columns next to each other etc.}. Array operations can be very fast, as they are sometimes lowered directly to cpu vector instructions. However, be aware of situations where a new array has to be created implicitly, e.g. dv = v2[] - v1[]; Let's look at what this really means: we are asking for dv to be set to refer to the vector difference between v2 and v1. Note we said nothing about the current elements of dv, it might not even have any! This means we need to put the result of v2[] - v1[] in a new chunk of memory, which we then set dv to refer to. Allocating new memory takes time, potentially taking a lot longer than the array operation itself, so if you can, avoid it!", anyone with the most basic programming and mathematical knowledge can write concise code operating on arrays, taking advantage of the potential speedups while being aware of the pitfalls.
In short: Vector syntax/array ops is/are great. Concise code that's easy to read and write. They fulfill one of the guiding principles of D: the most obvious code is fast and safe (or if not 100% safe, at least not too error-prone). More vector syntax capabilities please! I think implicit allocation is a good idea in the case of variable initialization, i.e. auto dv = v2[] - v1[];. As a general statement, however, i.e. dv = v2[] - v1[];, it could just as easily be a typo and result in a silent and hard-to-find performance bug. // An alternative syntax for variable initialization by an array operation expression: auto dv[] = v2[] - v1[];
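As a concrete illustration of the element-wise form discussed above (a minimal sketch; note that in current D the binary array op is only accepted as part of a slice assignment like dv[] = ..., so the allocating dv = v2[] - v1[]; form debated in the thread is hypothetical):

```d
void main() {
    double[3] v1 = [1.0, 2.0, 3.0];
    double[3] v2 = [4.0, 6.0, 8.0];
    double[3] dv;

    // Element-wise subtraction into existing storage: no allocation.
    dv[] = v2[] - v1[];
    assert(dv == [3.0, 4.0, 5.0]);
}
```

Because dv already owns its storage, the loop runs over the existing memory and nothing is allocated, which is exactly the property the post argues should not be silently lost to a typo.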
Re: Array Slices and Interior Pointers
On Tue, 11 Dec 2012 11:25:44 -0600, Alex Rønne Petersen wrote: On 11-12-2012 08:29, Rainer Schuetze wrote: On 11.12.2012 01:04, Alex Rønne Petersen wrote: http://xtzgzorex.wordpress.com/2012/12/11/array-slices-and-interior-pointers/ Destroy. Done. [snip] From what I could find in e.g. the Boehm GC, there seems to be significant work done to catch interior pointers in addition to base pointers (grep for GC_all_interior_pointers and related symbols). *Ahem* Arguments regarding performance A) require hard numbers and B) are implementation specific. [snip] Suppose we have a field int* p; p _isn't_ a slice, so your 'fixes' don't apply. [snip] So we have to look at the pointer and first figure out what kind of memory block it is /actually/ pointing to before we have any kind of type info available (just the knowledge that it's of type int* is not particularly useful by itself, other than knowing that it could be a pointer at all). How is p >> 12 slow or difficult? (Assuming log2(PageSize) == 12) So the TL;DR is: we avoid extra work to figure out the actual type of the memory something is pointing to by simply making such cases illegal. At the cost of extra work and more memory everywhere arrays are used. Whether that is practical, I do not know, and I don't plan to push for it anytime soon at least. But it has to be done for D to ever run on the CLI. The issue with the CLI has nothing to do with this. The problem is that D arrays are D slices (i.e. we don't have T[new]) and D code is written to be slice compatible, whereas the .NET libraries are, for the most part, slice incompatible. So slice-based code, in D or .NET, has to constantly convert back to arrays, which is a major performance sink. [snip] But if we make this assumption, D can never run on the CLI. False, see http://dnet.codeplex.com/. Interior pointers are OK on the stack and in registers, so taking pointers to fields inside aggregates should be fine so long as they are not stored in the heap. 
So what about unions?
Re: add phobos module std.halffloat ?
On Wed, 19 Dec 2012 09:35:39 -0600, Andrei Alexandrescu wrote: On 12/19/12 2:30 AM, Walter Bright wrote: https://github.com/D-Programming-Language/phobos/pull/1018/files Shouldn't it be part of std.numeric? Related, we should have a decision point "this must go through the review process" vs. "the pull review process is sufficient". New modules definitely must go through the review process, as should large additions to existing modules. Andrei It _IS_ part of std.numeric! Specifically, std.numeric.CustomFloat!16. To quote our own documentation:

// Define 16-bit floating point values
CustomFloat!16 x;                                  // Using the number of bits
CustomFloat!(10, 5) y;                             // Using the precision and exponent width
CustomFloat!(10, 5, CustomFloatFlags.ieee) z;      // Using the precision, exponent width and format flags
CustomFloat!(10, 5, CustomFloatFlags.ieee, 15) w;  // Using the precision, exponent width, format flags and exponent offset bias

// Use the 16-bit floats mostly like normal numbers
w = x*y - 1;
writeln(w);

// Function calls require conversion
z = sin(+x) + cos(+y);                      // Use unary plus to concisely convert to a real
z = sin(x.re) + cos(y.re);                  // Or use the .re property to convert to a real
z = sin(x.get!float) + cos(y.get!float);    // Or use get!T
z = sin(cast(float)x) + cos(cast(float)y);  // Or use cast(T) to explicitly convert

The only item missing from std.numeric.CustomFloat is an alias this line so that implicit conversion is supported. Oh, and documentation of the standard float properties, such as infinity/min/max/etc., which exist in the code, but are template methods and so don't appear in the ddoc.
Re: add phobos module std.halffloat ?
On Wed, 19 Dec 2012 13:48:29 -0600, Walter Bright wrote: On 12/19/2012 2:22 AM, ponce wrote: On Wednesday, 19 December 2012 at 07:30:54 UTC, Walter Bright wrote: https://github.com/D-Programming-Language/phobos/pull/1018/files If that ever helps, I implemented this fast conversion ftp://www.fox-toolkit.org/pub/fasthalffloatconversion.pdf for a half wrapper here: https://github.com/p0nce/gfm/blob/master/math/half.d It's woefully untested though. In 3D graphics efficiently converting float to half can speed up vertex submission. The pdf file shows as corrupted. It opens fine for me. (Win7/Adobe Reader 10.0.1)
Re: add phobos module std.halffloat ?
On Wed, 19 Dec 2012 05:47:38 -0600, Iain Buclaw wrote: On 19 December 2012 11:30, tn wrote: On Wednesday, 19 December 2012 at 10:13:56 UTC, Iain Buclaw wrote: On 19 December 2012 08:55, Walter Bright wrote: On 12/19/2012 12:47 AM, Alex Rønne Petersen wrote: On 19-12-2012 08:35, Jacob Carlborg wrote: On 2012-12-19 08:30, Walter Bright wrote: [snip] What is the difference between std.numeric.CustomFloat and this? With this, there's a choice of rounding modes, casting between float and integral types, and the fact that not many people know about it? *Ahem* That should be possible future support for rounding modes. Currently, the code paths exist but are restricted to ROUND.TONEAREST. I'm not sure why this is, given std.math.FloatingPointControl.rounding.
Re: add phobos module std.halffloat ?
On Wed, 19 Dec 2012 04:49:59 -0600, d coder wrote: On Wed, Dec 19, 2012 at 3:43 PM, Iain Buclaw wrote: How difficult would you think it would be to scale down (or up) this library type so it can be an emulated IEEE type of any size? (The whole shebang, e.g. quarter, half, single, double, quad, double-quad, 80-bit and 96-bit.) Just interested, as I think that a module which implements an IEEE floating point type that produces constant results cross-platform would be better than a dedicated module just for half float types. +1 See std.numeric.CustomFloat. It supports quarter, half, single, double and 80-bit. The wikipedia article on floating point (IEEE 754) includes a quad type but not a double-quad or 96-bit type. 96-bit should be possible, as it's just 80 bits with padding. Quad and double-quad would be tougher to do, due to the lack of a quad integer type or a 112-bit shift operator.
Re: A thought about garbage collection
On Wed, 19 Dec 2012 16:58:59 -0600, bearophile wrote: Benjamin Thaut: http://sebastiansylvan.wordpress.com/2012/12/01/garbage-collection-thoughts/ It seems a nice blog post. I don't remember what D does regarding per-thread heaps. It's been proposed and (sadly) rejected. Maybe Phobos emplace() is not enough and a bigger/better gun is needed to safely allocate in-place a higher percentage of class instances. It would be fairly easy to build a library type (e.g. Scope!T or InPlace!T) that uses emplace to store a class inside of a wrapper struct.
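A minimal sketch of what such a wrapper might look like (Scope, construct, and the buffer layout are hypothetical, not a Phobos API; alignment is handled only approximately via a size_t-backed buffer, and destruction is ignored for brevity):

```d
import std.conv : emplace;

class Widget {
    int x;
    this(int x) { this.x = x; }
}

// Hypothetical Scope!T: embeds a class instance inside the struct itself,
// avoiding a GC heap allocation for the object.
struct Scope(T) if (is(T == class)) {
    // size_t-aligned backing store big enough for one T instance
    private size_t[(__traits(classInstanceSize, T) + size_t.sizeof - 1) / size_t.sizeof] buffer;
    @disable this(this); // copying would leave the object pointing at the old buffer

    T construct(Args...)(Args args) {
        // emplace runs T's constructor in-place inside buffer
        return emplace!T(cast(void[]) buffer[], args);
    }
}

void main() {
    Scope!Widget s;          // storage lives on the stack, inside s
    auto w = s.construct(42);
    assert(w.x == 42);
}
```

The @disable this(this) is the important design point: because the object lives inside the struct, any copy of the struct would invalidate the class reference returned by construct.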
Re: add phobos module std.halffloat ?
On Thu, 20 Dec 2012 00:06:04 -0600, Rob T wrote: On Wednesday, 19 December 2012 at 20:44:19 UTC, Robert Jacques wrote: See std.numeric.CustomFloat. It supports quarter, half, single, double and 80-bit. The wikipedia article on the floating point (IEEE 754) includes a Quad type but not a double-quad or 96-bit type. 96-bit Should be possible, as it's just 80-bits with padding. Quad and double quad would be tougher to do, do to the lack of a quad integer type or a 112-bit bit shift operator. I did not see anything in the reference docs to indicate consistency across platform. Do these produce constant results cross-platform? Actually do any of the basic floating point types do this (Real excluded)? --rt CustomFloat is for storage purposes only. All operations are prefixed by converting the custom float to real first, which is what x86 does internally. This is in the Ddoc comments in the code, but because CustomFloat is a template, they don't appear in the html docs. As for consistent results, you do get a consistent storage size, which you don't get with real, but aside from that, IEEE only specifies a minimum accuracy for each operation. Combined with truncation issues, this means that results will very from platform to platform and from compiler to compiler. There is nothing new about this. In fact, x86 is more precise than x86-64 due to the switch from the x87 to SSE for floats/doubles.
Re: Simple features that I've always missed from C...
On Mon, 17 Oct 2011 16:53:42 -0400, Manu wrote: [snip] *Count leading/trailing zeroes:* I don't know of any even slightly recent architecture that doesn't have opcodes to count leading/trailing zeroes, although they do exist, so perhaps this is a little dubious. I'm sure this could be emulated for such architectures, but it might be unreasonably slow if used... D has this: check out std.intrinsic's bsr and bsf.
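A minimal sketch of these intrinsics (shown via core.bitop, where the old std.intrinsic functions now live; bsf/bsr return bit indices, from which trailing- and leading-zero counts follow directly):

```d
import core.bitop : bsf, bsr;

void main() {
    uint x = 0b1000;           // a single bit set at index 3

    assert(bsf(x) == 3);       // index of lowest set bit == number of trailing zeros
    assert(bsr(x) == 3);       // index of highest set bit
    assert(31 - bsr(x) == 28); // leading zeros of a 32-bit value
}
```

Note both functions are undefined for an input of 0, so real code guards that case before calling them.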
Re: To share or not to share, this is the question.
On Mon, 17 Oct 2011 08:28:08 -0400, Gor Gyolchanyan wrote: I don't get it. HWND is an alias for void*. void* is data. what do you mean, "alias is not data"? void* is a type, not a member field of the class (data)
Re: Introspection/Reflection/etc on Linux
On Tue, 18 Oct 2011 12:48:47 -0400, J Arrizza wrote: I'm trying to write some sample code to: 1 create an object via its name 2 search an object for its functions 3 call a static function in a class 4 call a non-static function in an object #1, #2 and #3 were straightforward. But #4 looks like it's not possible in DMD 2.0. I found this: http://www.digitalmars.com/d/archives/digitalmars/D/announce/Runtime_reflection_9949.html which implies #4 is doable in Windows but not Linux. I'm using Linux 64-bit Ubuntu. Any other avenues I can chase? TIA, John You could try out my improved variant implementation: https://jshare.johnshopkins.edu/rjacque2/public_html/variant.mht https://jshare.johnshopkins.edu/rjacque2/public_html/variant.d Example:

import variant;

class Foo { int x; }

void main(string[] args) {
    Variant.__register!Foo;
    Variant var = Object.factory( typeid(Foo).toString );
    var.__reflect("x", Variant(10)); // or explicitly
    assert(var.__reflect("x") == 10);
    return;
}
Re: __restrict
On Wed, 19 Oct 2011 07:58:15 -0400, Manu wrote: I agree, that is certainly more important :) I'm mainly just curious to know about how the main contributors feel about these things, and whether these things will be implemented/planned, or if they violate some fundamental language principles... Basically, I really want to start some major work in D, but before investing into the language, I want to know that important features are recognised and have a long term plan... Well, __restrict was mainly added to C (IIRC) to allow loop vectorization. In D, we have explicit array operations, which carry a lot of the same caveats as __restrict, except they are checkable. I'm curious to know what you mean by "maybe also because in D where possible we prefer things that the compiler is able to verify/enforce". I'm not sure how that really applies to __restrict. It's effectively an optimisation hint to the compiler, giving it explicit instructions... what would there be to verify or enforce in this case? That's the point: the compiler can't verify or enforce the assumptions of __restrict. Think of array indexing in C. The compiler has no way of verifying or enforcing that ptr[10_000] is a valid memory location. And this has led to undefined behavior and a large number of security exploits. In D, arrays always carry around their lengths, so indexes can be checked. No more undefined behavior and fewer exploits. Similarly, C's const provides no actual guarantees. But compilers used it to optimize code anyways, which has led many a programmer to long hours of debugging. Is it that it would be preferred if __restrict-ability could be implied by carefully crafted language rules? Yes, and it is for array operations. I just don't think that's possible.. But it's still an important keyword. How difficult is it to contribute to D in these areas? Is that something that is encouraged, or is D still too embryonic to have random people coming along and adding things here and there? 
It's very easy to contribute. The source code for DMD, LDC, GDC is all available and patches are regularly submitted and accepted. Check out D's github repositories and the mailing lists for more. On 19 October 2011 12:36, bearophile wrote: Manu: > I sent an email about this once before... but there was no real > response/discussion on the topic. It was discussed a bit in past, and restrict was not appreciated a lot, maybe also because in D where possible we prefer things that the compiler is able to verify/enforce. And I think D/DMD is not yet in a development stage where it cares for max performance details. I think there are plenty of more important things to work on before that. The recently almost-fixed "inout" was more urgent than "__restrict". Bye, bearophile
Re: __restrict
On Wed, 19 Oct 2011 17:10:55 -0400, Peter Alexander wrote: On 19/10/11 3:08 PM, Robert Jacques wrote: On Wed, 19 Oct 2011 07:58:15 -0400, Manu wrote: I agree, that is certainly more important :) I'm mainly just curious to know about how the main contributors feel about these things, and whether these things will be implemented/planned, or if they violate some fundamental language principles... Basically, I really want to start some major work in D, but before investing into the language, I want to know that important features are recognised and have a long term plan... Well, __restrict was mainly added to C (IIRC) to allow loop vectorization. In D, we have explicit array operations, which carry a lot of the same caveats as __restrict, except they are checkable. It's for far more than vectorization. Any place that has redundant loads can benefit from __restrict. I recommend this presentation on the subject: http://www.slideshare.net/guest3eed30/memory-optimization It's about memory optimizations. Aliasing issues start at slide 35. Thanks. I've seen most of that before. And even in those slides, the major (though not the only) use-case for __restrict is array and matrix processing. Most other use-cases can be cached manually or are really, really unsafe.
Re: Shared Delegates
On Wed, 19 Oct 2011 20:51:25 -0400, Michel Fortin wrote: On 2011-10-19 21:53:12 +, Andrew Wiley said: [snip] Also, I think how shared works is being misunderstood. Making a class synchronized should limit all member functions and fields to being shared or immutable, but it doesn't place any limits on the arguments to the member functions. So in the below:

synchronized class Thing2 {
    void doSomeWork(int i) {}
    void doSomeOtherWork(Thing2 t) {}
    void work() {}
}

i is of type int, not shared(int).
Re: to!() conversion between objects
On Wed, 19 Oct 2011 14:31:20 -0400, Piotr Szturmaj wrote: Piotr Szturmaj: I have written a simple conversion template for tuples, structs and classes: This is only the part to complement universal range/array to tuple/struct/class conversion. It may be useful in mapping runtime fields like database rows or CSV lines onto objects. I think Phobos needs a common solution for that. My proposed improvement to std.variant handles duck-typing style conversions between values.
Re: to!() conversion between objects
On Wed, 19 Oct 2011 14:16:31 -0400, Piotr Szturmaj wrote: bearophile wrote: Piotr Szturmaj: I have written a simple conversion template for tuples, structs and classes: Do you have some use case to show me? class C { int i; string s; } struct S { string s; float f; } auto c = to!C(S("5", 2.5f)); assert(c.i == 5 && c.s == "2.5"); So C's s field maps to S's f field, not its s field? That seems unintuitive and bug-prone.
Re: To share or not to share, this is the question.
On Wed, 19 Oct 2011 02:06:06 -0400, Gor Gyolchanyan wrote: I meant, i have a member of type HWND. On Wed, Oct 19, 2011 at 6:47 AM, Robert Jacques wrote: On Mon, 17 Oct 2011 08:28:08 -0400, Gor Gyolchanyan wrote: I don't get it. HWND is an alias for void*. void* is data. what do you mean, "alias is not data"? void* is a type, not a member field of the class (data) And shared class T { alias void* v; v v2; static assert(is(typeof(v2) == shared)); } compiles so what's your problem?
Re: sqrt(2) must go
On Wed, 19 Oct 2011 22:52:14 -0400, Marco Leise wrote: On 20.10.2011, 02:46, dsimcha wrote: On 10/19/2011 6:25 PM, Alvaro wrote: On 19/10/2011 20:12, dsimcha wrote: == Quote from Don (nos...@nospam.com)'s article The hack must go. No. Something as simple as sqrt(2) must work at all costs, period. A language that adds a bunch of silly complications to something this simple is fundamentally broken. I don't remember your post on implicit preferred conversions, but IMHO implicit conversions of integer to double is a no-brainer. Requiring something this simple to be explicit is Java/Pascal-like overkill on explicitness. Completely agree. I call that uncluttered programming. No excessive explicitness should be necessary when what you mean is obvious (under some simple conventions). Leads to clearer code. Yes, and for the most part uncluttered programming is one of D's biggest strengths. Let's not ruin it by complicating sqrt(2). What is the compiler to do with sqrt(5_000_000_000) ? It doesn't fit into an int, but it fits into a double. Simple, is a 5_000_000_000 long, and longs convert to reals. Also, 5_000_000_000 does not fit, exactly inside a double.
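The literal-typing rule being invoked here can be checked directly (a minimal sketch; static assert runs at compile time):

```d
void main() {
    auto x = 5_000_000_000;               // too big for int...
    static assert(is(typeof(x) == long)); // ...so the literal is typed as long
    real r = x;                           // and long implicitly converts to real
    assert(r == 5_000_000_000.0L);
}
```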
Re: sqrt(2) must go
On Wed, 19 Oct 2011 22:57:48 -0400, Robert Jacques wrote: On Wed, 19 Oct 2011 22:52:14 -0400, Marco Leise wrote: On 20.10.2011, 02:46, dsimcha wrote: On 10/19/2011 6:25 PM, Alvaro wrote: On 19/10/2011 20:12, dsimcha wrote: == Quote from Don (nos...@nospam.com)'s article The hack must go. No. Something as simple as sqrt(2) must work at all costs, period. A language that adds a bunch of silly complications to something this simple is fundamentally broken. I don't remember your post on implicit preferred conversions, but IMHO implicit conversions of integer to double is a no-brainer. Requiring something this simple to be explicit is Java/Pascal-like overkill on explicitness. Completely agree. I call that uncluttered programming. No excessive explicitness should be necessary when what you mean is obvious (under some simple conventions). Leads to clearer code. Yes, and for the most part uncluttered programming is one of D's biggest strengths. Let's not ruin it by complicating sqrt(2). What is the compiler to do with sqrt(5_000_000_000) ? It doesn't fit into an int, but it fits into a double. Simple, is a 5_000_000_000 long, and longs convert to reals. Also, 5_000_000_000 does not fit, exactly inside a double. Oops. That should be '5_000_000_000 is a long' not ' is a 5_000_000_000 long'
Re: sqrt(2) must go
On Wed, 19 Oct 2011 23:01:34 -0400, Steven Schveighoffer wrote: On Wed, 19 Oct 2011 22:57:48 -0400, Robert Jacques wrote: On Wed, 19 Oct 2011 22:52:14 -0400, Marco Leise wrote: Am 20.10.2011, 02:46 Uhr, schrieb dsimcha : On 10/19/2011 6:25 PM, Alvaro wrote: El 19/10/2011 20:12, dsimcha escribió: == Quote from Don (nos...@nospam.com)'s article The hack must go. No. Something as simple as sqrt(2) must work at all costs, period. A language that adds a bunch of silly complications to something this simple is fundamentally broken. I don't remember your post on implicit preferred conversions, but IMHO implicit conversions of integer to double is a no-brainer. Requiring something this simple to be explicit is Java/Pascal-like overkill on explicitness. Completely agree. I call that uncluttered programming. No excessive explicitness should be necessary when what you mean is obvious (under some simple conventions). Leads to clearer code. Yes, and for the most part uncluttered programming is one of D's biggest strengths. Let's not ruin it by complicating sqrt(2). What is the compiler to do with sqrt(5_000_000_000) ? It doesn't fit into an int, but it fits into a double. Simple, is a 5_000_000_000 long, and longs convert to reals. Also, 5_000_000_000 does not fit, exactly inside a double. It doesn't? I thought double could do 53 bits? Yes. You're right. Sorry, my brain automatically skipped forward to 5_000_000_000 => long => real. Although I agree, long should map to real, because obviously not all longs fit into a double exactly. -Steve
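The 53-bit point conceded above can be verified directly: 5_000_000_000 needs only 33 significant bits, so it round-trips through double exactly, while 2^53 + 1 does not (a minimal sketch):

```d
void main() {
    long small = 5_000_000_000;        // 33 significant bits: exact as a double
    assert(cast(long) cast(double) small == small);

    long big = (1L << 53) + 1;         // 54 significant bits: rounds when stored
    assert(cast(long) cast(double) big != big);
}
```

This is also why long → real is the safer implicit mapping on x86: the 80-bit real has a 64-bit mantissa, enough to hold any long exactly.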
Re: sqrt(2) must go
On Thu, 20 Oct 2011 09:11:27 -0400, Don wrote: [snip] I'd like to get to the situation where those overloads can be added without breaking people's code. The draconian possibility is to disallow them in all cases: integer types never match floating point function parameters. The second possibility is to introduce a tie-breaker rule: when there's an ambiguity, choose double. And a third possibility is to only apply that tie-breaker rule to literals. And the fourth possibility is to keep the language as it is now, and allow code to break when overloads get added. The one I really, really don't want, is the situation we have now: #5: whenever an overload gets added, introduce a hack for that function... I agree that #5 and #4 are not acceptable longer-term solutions. I do CUDA/GPU programming, so I live in a world of floats and ints. So changing the rules does worry me, but mainly because most people don't use floats on a daily basis, which introduces bias into the discussion. Thinking it over, here are my suggestions, though I'm not sure if 2a or 2b would be best: 1) Integer literals and expressions should use range propagation to use the thinnest lossless conversion. If no lossless conversion exists, then an error is raised. Choosing double as a default is always the wrong choice for GPUs and most embedded systems. 2a) Lossy variable conversions are disallowed. 2b) Lossy variable conversions undergo bounds checking when asserts are turned on. The idea behind 2b) would be:

int i = 1;
float f = i; // assert(true);
i = int.max;
f = i;       // assert(false);
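Option 2b could be prototyped today as a library helper (a sketch only; checkedTo is a hypothetical name, not an existing Phobos function; the comparison is done at real precision so that rounding in the narrower type is actually detected rather than masked by implicit promotion):

```d
// Hypothetical bounds/precision-checked conversion, illustrating option 2b.
T checkedTo(T, S)(S value) {
    T result = cast(T) value;
    // Compare at real precision: a plain result == value would promote the
    // int operand to float and hide the rounding we are trying to catch.
    assert(cast(real) result == cast(real) value, "lossy conversion");
    return result;
}

void main() {
    float f = checkedTo!float(1);  // exact: passes
    assert(f == 1.0f);
    // checkedTo!float(int.max);   // would trip the assert: int.max
    //                             // rounds to 2_147_483_648.0f
}
```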
Re: __restrict
On Thu, 20 Oct 2011 07:37:22 -0400, Manu wrote: Caching results manually is very tedious work, and makes a mess of your code.. If you've had to do that yourself you'd understand how annoying it can be. I do do it all the time and I've never found it that tedious. Especially compared to all the loop invariant divisions, expressions, function calls, etc. you have to excise from the loop anyways... I mean, L1 cache has a latency of ~1 cycle, division is ~50-150 cycles. Placing restrict on sensible pointers is much cleaner and saves a lot of time, and also results in optimisation everywhere it is used, rather than just the 1-2 places that you happened to notice it was a problem... Except that you should only use __restrict where you notice a problem. In case you didn't know, C++ compilers do (did?) all the __restrict optimizations for const variables. And the systematic use of const on these compilers is a well known source of Heisen-bugs. I haven't heard that this 'feature' has been disabled, but it's been a while since I checked.
Re: __restrict
On Fri, 21 Oct 2011 02:19:03 -0400, Manu wrote: Naturally, as with all my posts, I'm not referring to x86 :) Naturally, stating that upfront would drastically improve our understanding and weighting of your arguments. :) L1 is rarely ~1 cycle access, there are even a few architectures that can't write to L1 at all, And I work on GPUs, so I can understand that pain. Then again, I'm so much more conscious of my code in those situations. For example, the following are not all created equal:

for(uint i = 0; i < length; i++)
for( int i = 0; i < length; i++)
for(uint i = length-1; i >= 0; i--)
for( int i = length-1; i >= 0; i--)

and I've never come in contact with a compiler that can do anything useful with the const keyword in C. That said, __restrict is fundamentally different than const; const suggests I can't change the memory pointed at, when that is often exactly what I intend to do. My point was that there are C++ compilers that do do things (not useful ones) with const, which are in principle the exact same things that would happen to a __restrict object. And it caused lots of hard-to-find bugs. So I feel safe in saying that usage of __restrict should never be widespread. It should only be used in well-understood and controlled code hot spots. It seems to be very hard to convince people who have never had to work on these platforms that it's really important :/ One of the best ways to do that is to prove it with numbers, with real code that a conscientious embedded developer could be expected to write.
Re: sqrt(2) must go
On Fri, 21 Oct 2011 09:00:48 -0400, Manu wrote: On 21 October 2011 10:53, Manu wrote: On 21 October 2011 09:00, Don wrote: [snip] 1: Seems reasonable for literals; "Integer literals and expressions should use range propagation to use the thinnest lossless conversion"... but can you clarify what you mean by 'expressions'? I assume we're talking strictly literal expressions? Consider sqrt(i % 10). No matter what i is, the range of i % 10 is 0-9. I was more thinking of whether plain old assignment would be allowed: float f = myshort; Of course, if we deny implicit conversion, shouldn't the following fail to compile? float position = index * resolution; 2b: Does runtime bounds checking actually address the question of which of an ambiguous set of functions to choose? If I read you correctly, 2b suggests bounds checking the implicit cast for data loss at runtime, but which to choose? float/double/real? We'd still be arguing that question even with this proposal taken into consideration... :/ Perhaps I missed something? Yes, but only because I didn't include it. I was thinking of float f = i; as opposed to func(i) for some reason. Bounds checking would only make sense if func(float) was the only overload. Naturally all this complexity assumes we go with the tie-breaker approach, which I'm becoming more and more convinced is a bad plan... Then again, with regards to 1, the function chosen will depend on the magnitude of the int, perhaps a foreign constant; you might not clearly be able to know which one is called... What if the ambiguous overloads don't actually perform identical functionality with just different precision? .. Then whoever wrote the library was Evil(tm). Given that these rules wouldn't interfere with function hijacking, I'm not sure of the practicality of this concern. Do you have an example? I don't like the idea of it being uncertain. And one more thing to ponder, is the return type telling here? 
float x = sqrt(2); Obviously this may only work for these pure maths functions where the return type is matched to the args, but maybe it's an element worth considering. ie, if the function parameter is ambiguous, check for disambiguation via the return type...? Sounds pretty nasty! :)
Re: Compiler patch for runtime reflection
On Fri, 21 Oct 2011 17:15:02 -0400, Alex Rønne Petersen wrote: On 21-10-2011 21:07, Vladimir Panteleev wrote: Hi, Igor Stepanov has created a patch for DMD and Druntime which adds RTTI information for class and struct members. Example: import std.stdio; class Foo { static void PrintHello() { writeln("Hello"); } } void main() { auto info = cast(OffsetTypeInfo_StaticMethod)Foo.classinfo.m_offTi[0]; assert(info.name == "PrintHello"); auto print = cast(void function())info.pointer; print(); //prints "Hello" } Absolutely awesome! While the inclusion of such functionality into the language remains a disputed matter, would anyone be interested in an unofficial patch for this? Yes, very much. I would recommend setting up a fork on GitHub, and then adding it to a branch, e.g. 'reflection', if it's not going to be included in mainline DMD. Really? A runtime reflection system has been on the review queue for over six months and I've brought the subject up on the newsgroup. I've received zero feedback. So while I'm sure everyone wants to check RTTI off the D features list, I've not seen much real interest in it.
Re: Compiler patch for runtime reflection
On Fri, 21 Oct 2011 17:23:17 -0400, Daniel Gibson wrote: Am 21.10.2011 21:07, schrieb Vladimir Panteleev: Hi, Igor Stepanov has created a patch for DMD and Druntime which adds RTTI information for class and struct members. Example: import std.stdio; class Foo { static void PrintHello() { writeln("Hello"); } } void main() { auto info = cast(OffsetTypeInfo_StaticMethod)Foo.classinfo.m_offTi[0]; assert(info.name == "PrintHello"); auto print = cast(void function())info.pointer; print(); //prints "Hello" } While the inclusion of such functionality into the language remains a disputed matter, would anyone be interested in an unofficial patch for this? Walter: would it be okay if the compiler changes were published as a GitHub fork, or should we stick to patches? I'd love to see proper runtime reflection support in D, including functionality to get information about available methods (their name and parameters) and a way to call them. What do you mean by their 'parameters'? What about overloads? Attributes? Arguments? Argument attributes? Something that is close to what Java offers would be great. And what, exactly, does Java offer? What works? What doesn't work? What's missing? BTW: I don't really see the problem with providing this information (overhead-wise) - the information needs to be available once per class/struct, but objects of classes just need one pointer to it (other types don't even need that because they're not polymorphic and - like methods of structs - the address of the information is known at compile-time). 1) Unused information is simply bloat: it increases exe size, slows the exe down and increases the runtime memory footprint. 2) On a lot of systems (i.e. consoles, embedded, smart phones, tablets) memory and disk space are both highly constrained resources that you don't want to waste. 3) RTTI provides a back-door into a code-base; one that for many reasons you may want to keep closed.
Re: sqrt(2) must go
On Fri, 21 Oct 2011 19:04:43 -0400, Manu wrote: It would still allow function hijacking. void func(double v); exists... func(2); then someone comes along and adds func(float v); .. It will now hijack the call. That's what you mean, right? Hijacking is what happens when someone adds func(float v); _in another module_. And that hijack would/should still be detected, etc. like any other hijack.
Re: sqrt(2) must go
On Sat, 22 Oct 2011 05:42:10 -0400, Manu wrote: Sure, and hijacking is bound to happen under your proposal, no? How would it be detected? On 22 October 2011 06:51, Robert Jacques wrote: On Fri, 21 Oct 2011 19:04:43 -0400, Manu wrote: It would still allow function hijacking. void func(double v); exists... func(2); then someone comes along and adds func(float v); .. It will now hijack the call. That's what you mean, right? Hijacking is what happens when someone adds func(float v); _in another module_. And that hijack would/should still be detected, etc. like any other hijack. Manu, I'm not sure you understand how function hijack detection works today. Let us say you have three modules:

module a;
float func(float v) { return v; }

module b;
double func(double v) { return v; }

module c;
int func(int v) { return v*v; }

which all define a func method. Now, if you

import a;
import b;

void main(string[] args) {
    assert(func(1.0f) == 1.0f); // Error
}

you'll get a function hijacking error, because func(1.0f) matches both func(float) and func(double). However, if you instead

import a;
import c;

void main(string[] args) {
    assert(func(1.0f) == 1.0f); // OK: no ambiguity
}

you won't get an error, because func(1.0f) doesn't match func(int). In short, the best overload is only selected _after_ the module name has been resolved. The proposal of myself and others only affects which overload is the best match; it has no possible effect on function hijacking.
Re: Compiler patch for runtime reflection
On Sat, 22 Oct 2011 07:57:17 -0400, Daniel Gibson wrote: Am 22.10.2011 05:48, schrieb Robert Jacques: [snip] What do you mean by their 'parameters'? What about overloads? Attributes? Arguments? Argument attributes? Primarily arguments. That should help identifying overloads. But attributes and argument attributes are needed as well. But handling overloads is the responsibility of the dispatch system. What would be the use cases for this information? Something that is close to what Java offers would be great. And what, exactly, does JAVA offer? What works? What doesn't work? What's missing? See http://download.oracle.com/javase/6/docs/api/java/lang/Class.html You can access constructors (look for one by its parameters or get all of them), the same for methods (methods the class declares and all methods, i.e. also inherited ones), implemented interfaces, fields, ... Thanks, I'll have a look-see. BTW: I don't really see the problem with providing this information (overhead-wise) - the information needs to be available once per class/struct, but objects of classes just need one pointer to it (other types don't even need that because they're not polymorphic and - like methods of structs - the address of the information is known at compile-time). 1) Unused information is simply bloat: it increases exe size, slows the exe down and increases the runtime memory footprint. Yeah, but (mostly) only per class, not per object (besides one pointer per object, which shouldn't be that bad, and something like this seems to be already there for the existing typeinfo) I was assuming the per-object bloat was zero. Oh, and I did forget that RTTI increases compile times.
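A short tour of what the java.lang.Class API linked above actually provides — constructors looked up by parameter types, declared vs. inherited methods, and fields readable by name. The demo class here is hypothetical; only the reflection calls are from the linked API:

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// What java.lang.Class exposes, per the linked documentation.
public class ClassInfoDemo {
    public static class Point {
        public int x, y;
        public Point(int x, int y) { this.x = x; this.y = y; }
        public int lengthSquared() { return x * x + y * y; }
    }

    public static void main(String[] args) throws Exception {
        Class<Point> cls = Point.class;

        // Look up a constructor by its parameter list, then instantiate.
        Constructor<Point> ctor = cls.getConstructor(int.class, int.class);
        Point p = ctor.newInstance(3, 4);

        // getDeclaredMethods() lists methods this class declares;
        // getMethods() adds inherited public ones (toString, ...).
        Method m = cls.getMethod("lengthSquared");
        System.out.println(m.invoke(p)); // 25

        // Public fields are reflected too, readable/writable by name.
        Field fx = cls.getField("x");
        System.out.println(fx.getInt(p)); // 3
    }
}
```

This is roughly the feature set being asked for: enumeration plus invocation, with parameter types available for disambiguating overloads.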
Re: Compiler patch for runtime reflection
On Sat, 22 Oct 2011 06:51:00 -0400, Alex Rønne Petersen wrote: On 22-10-2011 05:36, Robert Jacques wrote: On Fri, 21 Oct 2011 17:15:02 -0400, Alex Rønne Petersen wrote: On 21-10-2011 21:07, Vladimir Panteleev wrote: Hi, Igor Stepanov has created a patch for DMD and Druntime which adds RTTI information for class and struct members. Example: import std.stdio; class Foo { static void PrintHello() { writeln("Hello"); } } void main() { auto info = cast(OffsetTypeInfo_StaticMethod)Foo.classinfo.m_offTi[0]; assert(info.name == "PrintHello"); auto print = cast(void function())info.pointer; print(); //prints "Hello" } Absolutely awesome! While the inclusion of such functionality into the language remains a disputed matter, would anyone be interested in an unofficial patch for this? Yes, very much. I would recommend setting up a fork on GitHub, and then adding it to a branch, e.g. 'reflection', if it's not going to be included in mainline DMD. Really? A runtime reflection system has been on the review queue for over six months and I've brought the subject up on the newsgroup. I've received zero feedback. So while I'm sure everyone wants to check RTTI off the D features list, I've not seen much real interest in it. I must have missed that. What should I search for to find your thread? Don't bother; as I said, the thread went nowhere. However, if you'd like to look at the code / docs for my proposed improved variant module: https://jshare.johnshopkins.edu/rjacque2/public_html/variant.mht https://jshare.johnshopkins.edu/rjacque2/public_html/variant.d
Re: Compiler patch for runtime reflection
On Sat, 22 Oct 2011 14:38:22 -0400, foobar wrote: [snip] IMHO, RTTI should be a built-in feature of the language. 1) You need one global system agreed upon by all clients. Otherwise, you could end up in a situation where a single class has more than a single set of metadata used in several different scenarios. Or a singleton class/struct. Yes, you should have a single global system, but nothing says that the system has to be the language. Library solutions are also valid. 2) This information is required anyway in the run-time itself. This is a requirement for an accurate GC. a) Not every user of D wants to use a GC. In fact, there's currently a push to remove / improve all the GC-mandatory features of D to support ref-counting / manual memory management. b) RTTI is separate from the GC's BlockInfo. Yes, you could generate the GC BlockInfo from the RTTI, but doing so would be so inefficient that it's not worth contemplating. 3) Other tools might use this info as well, debuggers, IDEs, etc. Debuggers and IDEs already have all this information, and a lot more. 4) It's in the spirit of D's design - make the common/safe case the default. Since when has reflection ever been the common/safe case? .tupleof, despite being a god-send, is one of the most unsafe features of D. My personal impression of reflection is that it's one of those features that you only want 1% of the time, but if you don't have it, you're going to be pulling teeth to work around it. As others mentioned, there should be a compiler switch to turn this off. E.g. when developing for an embedded system where memory is scarce it makes sense to not generate RTTI. It also makes sense to disable the GC and preallocate the memory. Aren't this statement and statement 4) in conflict with each other? Unlike a GC, turning off RTTI breaks everything that uses it.