Re: Patches, bottlenecks, OpenSource
bearophile wrote: After the recent posts about patches, I can write something myself on this topic :-) I am far from being an expert on big software projects, but I think I am now able to understand some things about the D project. I like D; it's a living project, but from what I've seen, D has not had much significant success so far. I think a lot of people (especially young programmers) don't want "a better C++", they want no stinking C++ at all. They want something quite different. (On the other hand, I am now not so sure Scala will see widespread success; its type system makes it hard to learn.)

A C++-class language compiler is a big project that requires quite different skill sets: in the D compiler there is work to do on vector operations, lent/uniqueness, multi-core, 64-bit, work to keep it well updated on Mac/Linux/Win and well packaged, development of Phobos and its data structures, work on the back-end to use the SSE registers and future CPUs or GPUs, tuning of the associative arrays, future improvements to the core language, improvements to the unit test system, large improvements to the D web site and the documentation, big improvements needed for the GC to turn it into something more modern, and so on. Such things have to be refined, debugged, made efficient, cared for, polished. So as D develops and grows it will get hard for a single person to write all the patches to the language and its other parts. As D grows this will become a bottleneck, and from what I've seen it already is.

The disadvantage of giving other people the right to patch the front-end is some lost control, but Walter can review patches after they are already applied and working (in a big project, having a director is positive). This can speed up the patching process itself. The compiler may become more of a patchwork of different programming styles, and Walter will have to ask other people how some parts he didn't write work. This can be doable if the front-end becomes more modular.
LLVM shows a more modular design that can be copied. That is a lot of refactoring work. Increasing openness can increase the influx of devs and their willingness to help the project. This is quite positive, and its importance can't be overestimated. Pyrex (http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/ ) was forked into Cython (http://www.cython.org/ ) because the Pyrex author was too slow in applying patches. A fork will not happen to D soon, because there are not yet enough people who care about and work on D enough to make a fork viable; D is not a project as alive as Pyrex. Cython is a tiny rib of the Python language circus. I suggest giving patching rights to Don; it can be one more step toward open-sourcing the D development style :-) In the future such rights can be given to a few other good people who have shown they deserve enough trust. I am not sure the future of D is in the dmd back-end; LLVM looks in better shape for this purpose. LLVM is able to support 99.9% of the features of D2 (and I presume the LLVM devs are willing to add the few missing things if someone gives them patches; from the unladen-swallow project I have seen they are open enough), and it supports a large number of good features not currently present in the dmd back-end. Currently LDC is a much better compiler than dmd (try it if you don't believe me!), but its devs don't yet much appreciate the D2 language. I don't like seeing the Walter locomotive so detached from the present nice LDC D1 compiler, and from the future of LDC and LLVM. I'd still like LDC to become the official D2 compiler, developed and cared for :-) Bye, bearophile

It's all true. I also think we should use a DVCS, so that Walter can apply patches easily and we can upload patch branches directly to somewhere they can be merged and/or reconciled by the original authors in case of conflict. It's all good. The history of a patch can also be tracked even after merging.
I suggest Bazaar, but I am deeply biased on the matter. Further, Don et al. could act as gatekeepers to prevent Walter from getting bogged down with too many patches. And then things will be all good and I may find the time to get interested in D again.
Re: Is it time to deprecate COM compatibility through D interfaces?
On Tue, 13 Apr 2010 19:29:24 -0400, Steven Schveighoffer wrote: Given that structs have become extremely powerful, and with the advent of opDispatch, would it be possible to deprecate supporting COM via D interfaces in favor of a library solution? There are some crappy drawbacks for having interface be dual-purposed: - Although 99.9% of interfaces are actually instances of Object, you can't call Object functions directly on an interface. This includes opEquals, opCmp, toString and friends. - They are not implicitly castable to Object. - There is no centralized base interface, so there is no argument type you can use that can accept any interface. For instance, if you wanted to do some runtime reflection to determine an interface's methods or something. - All these drawbacks are absolutely pointless on non-Microsoft OSes. We are in the process of getting rid of builtin complex types in favor of library types -- can we do something similar? I admit I am not familiar with COM, except my few experiences with it I hated :) Does anyone actually use D interfaces to represent COM objects? What would the drawbacks be of interfacing COM objects via compile-time generated structs that do the busy work? Another possibility is simply making normal interfaces derive from Object, and COM interfaces not. I think this has been discussed before. -Steve

Sorry Steve. I just read the messages about opCmp and interfaces, and you proposed something similar to what I wrote in my previous message. Well, at least it indicates that it's something that surfaces frequently and that more than one person has thought about it :)
Re: Is it time to deprecate COM compatibility through D interfaces?
On Tue, 13 Apr 2010 19:29:24 -0400, Steven Schveighoffer wrote: Given that structs have become extremely powerful, and with the advent of opDispatch, would it be possible to deprecate supporting COM via D interfaces in favor of a library solution? There are some crappy drawbacks for having interface be dual-purposed: - Although 99.9% of interfaces are actually instances of Object, you can't call Object functions directly on an interface. This includes opEquals, opCmp, toString and friends. - They are not implicitly castable to Object. - There is no centralized base interface, so there is no argument type you can use that can accept any interface. For instance, if you wanted to do some runtime reflection to determine an interface's methods or something. - All these drawbacks are absolutely pointless on non-Microsoft OSes. We are in the process of getting rid of builtin complex types in favor of library types -- can we do something similar? I admit I am not familiar with COM, except my few experiences with it I hated :) Does anyone actually use D interfaces to represent COM objects? What would the drawbacks be of interfacing COM objects via compile-time generated structs that do the busy work? Another possibility is simply making normal interfaces derive from Object, and COM interfaces not. I think this has been discussed before. -Steve

I use them to create and manipulate Word and Excel documents through Automation. But yes, I think it overcomplicates the implementation of interfaces for something that is not that common. Another possibility is to create a property, maybe @cominterface or something, and only interfaces marked with that property would work with COM. Although I don't know how feasible this is, or whether it complicates the language.
Re: Druntime AA interface could be enhanced a bit
On Fri, 09 Apr 2010 12:08:54 -0700, Walter Bright wrote: > Michael Rynn wrote: >> In playing around trying to understand and make a better AA, I have >> updated again the dsource/projects/aa/trunk/druntime/aaA.d, this time >> to be a single linked list version. > > I don't understand, the latest release already changed aaA.d to use a > singly linked list. Well, well, this is the power of synchronicity. After I picked up my jaw and eyeballs from the floor, I checked out the latest. The new aaA.d is good, but I think I can do better. I am still testing out some ideas, and still learning, so I will post an update to bugzilla after my brain reset.
Re: OpEquals and Interfaces
On Tue, 13 Apr 2010 19:47:13 -0400, Jason House wrote: Steven Schveighoffer Wrote: On Tue, 13 Apr 2010 15:50:36 -0400, Christoph Mueller wrote: > I'm currently writing a library in D2 which uses intensively interfaces > and i meet a problem by overloading the opEquals Operator. > > In some of my implementations i want to compare an object through an > interface of another instance > > Unfortanetly, the opEquals Operator uses only Object parameters and > according to the current DMD-Compiler it's not possible to cast implicit > an Interface to an Object. (Got a nice compiler error) > > Is there any reason to forbid implicit downcasting from any interface to > Object? Any good reason? No. But the stated reason is usually that interfaces don't necessarily have to be Objects, they can be COM objects, which 1) has no bearing in some OSes, and 2) does anyone use this feature? It could also be a C++ class C++ classes are already tagged with extern(C++) (i.e. there's a difference the compiler can grip to make it behave differently) Maybe that's the solution, make COM solutions extern(COM), or @COM or something. That's fine also. Anything to then allow the compiler to assume unadorned interfaces are D objects. -Steve
Re: OpEquals and Interfaces
Steven Schveighoffer Wrote: > On Tue, 13 Apr 2010 15:50:36 -0400, Christoph Mueller > wrote: > > > I'm currently writing a library in D2 which uses intensively interfaces > > and i meet a problem by overloading the opEquals Operator. > > > > In some of my implementations i want to compare an object through an > > interface of another instance > > > > Unfortanetly, the opEquals Operator uses only Object parameters and > > according to the current DMD-Compiler it's not possible to cast implicit > > an Interface to an Object. (Got a nice compiler error) > > > > Is there any reason to forbid implicit downcasting from any interface to > > Object? > > Any good reason? No. > > But the stated reason is usually that interfaces don't necessarily have to > be Objects, they can be COM objects, which 1) has no bearing in some OSes, > and 2) does anyone use this feature? It could also be a C++ class
Re: [feedback] folding in scintilla
Andrej Mitrovic Wrote: > Personally, I would prefer the left and right brace to stay on the same > line as the function definition, and maybe add an ellipsis between them > so I can tell that function is folded just by looking at the code. That's a bit like how Zeus does its folding: http://www.zeusedit.com/images/lookfold.png
Re: What do you think about that ?
Hello Ezneh, I think it's not really serious but these examples are pretty true ... So, Andrei, Walter, have you got some beard for now? The last time anyone checked, Walter didn't. OTOH, the last time those links came up was when Walter mentioned doing the evil-genius thing with a mustache, so who knows, there might be hope :)
Re: OpEquals and Interfaces
On Tue, 13 Apr 2010 19:13:31 -0400, Steven Schveighoffer wrote: I confirmed with 2.043 that this is a bug. I'll file with a test case. http://d.puremagic.com/issues/show_bug.cgi?id=4088
Is it time to deprecate COM compatibility through D interfaces?
Given that structs have become extremely powerful, and with the advent of opDispatch, would it be possible to deprecate supporting COM via D interfaces in favor of a library solution? There are some crappy drawbacks to having interface be dual-purposed:

- Although 99.9% of interfaces are actually instances of Object, you can't call Object functions directly on an interface. This includes opEquals, opCmp, toString and friends.
- They are not implicitly castable to Object.
- There is no centralized base interface, so there is no argument type you can use that accepts any interface. For instance, if you wanted to do some runtime reflection to determine an interface's methods or something.
- All these drawbacks are absolutely pointless on non-Microsoft OSes.

We are in the process of getting rid of built-in complex types in favor of library types -- can we do something similar here? I admit I am not familiar with COM, except for a few experiences with it that I hated :) Does anyone actually use D interfaces to represent COM objects? What would the drawbacks be of interfacing with COM objects via compile-time-generated structs that do the busy work? Another possibility is simply making normal interfaces derive from Object, and COM interfaces not. I think this has been discussed before. -Steve
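For what it's worth, a library-side COM wrapper along the lines suggested above might look something like this minimal sketch. Everything here is hypothetical -- ComPtr and comInvoke are illustrative names, not an existing API -- it only shows how opDispatch could generate the busy work at compile time:

```d
// Hypothetical sketch of COM-via-struct instead of COM-via-interface.
// ComPtr and comInvoke are made-up names for illustration only.
struct ComPtr(T)
{
    void* raw; // the underlying COM interface pointer

    // Forward any method call to the COM object.  A real implementation
    // would generate the proper vtable call from T's declaration (or go
    // through IDispatch::Invoke); comInvoke here is just a placeholder.
    auto opDispatch(string name, Args...)(Args args)
    {
        return comInvoke!(T, name)(raw, args);
    }
}
```

This would leave D interfaces free to always derive from Object, while COM objects get a value-type handle with reference-counting and call forwarding done in the library.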
Re: OpEquals and Interfaces
On Tue, 13 Apr 2010 19:16:21 -0400, Steven Schveighoffer wrote: On Tue, 13 Apr 2010 17:27:20 -0400, Christoph Mueller wrote: If you are using D2, there is a workaround: interface I { final bool opEquals(I other) { Object me = cast(Object)this; Object they = cast(Object)other; return equals(me, they); } } But it would be nice if the compiler did this automatically. There are other things that suck because interfaces are not assumed to be derived from Object. What kind of things also doesn't work relating to interfaces and Object ? Any base method of Object -- opCmp, toHash, or simply passing an interface to a function that accepts an Object. Add toString to that, that's a biggie... -Steve
Re: OpEquals and Interfaces
On Tue, 13 Apr 2010 17:27:20 -0400, Christoph Mueller wrote: If you are using D2, there is a workaround: interface I { final bool opEquals(I other) { Object me = cast(Object)this; Object they = cast(Object)other; return equals(me, they); } } But it would be nice if the compiler did this automatically. There are other things that suck because interfaces are not assumed to be derived from Object. What kind of things also doesn't work relating to interfaces and Object ? Any base method of Object -- opCmp, toHash, or simply passing an interface to a function that accepts an Object. There are others, but I can't think of them right now. -Steve
Re: OpEquals and Interfaces
On Tue, 13 Apr 2010 18:14:55 -0400, Kagamin wrote: Steven Schveighoffer Wrote: Any good reason? No. But the stated reason is usually that interfaces don't necessarily have to be Objects, they can be COM objects, which 1) has no bearing in some OSes, and 2) does anyone use this feature? Which feature? Interfaces? COM? upcasting? or opEquals? I don't use opEquals. :3 COM via D interfaces. -Steve
Re: OpEquals and Interfaces
On Tue, 13 Apr 2010 18:04:04 -0400, Ali Çehreli wrote: Christoph Mueller is asking the exact problem that I've been having. :) Steven Schveighoffer wrote: > If you are using D2, there is a workaround: > > interface I > { >final bool opEquals(I other) >{ > Object me = cast(Object)this; > Object they = cast(Object)other; > return equals(me, they); Is 'equals' a function on this interface? No, it's a new global function in object.di, and it's called opEquals, not equals. I guess I messed that up royally :) Just do me == they; That will call the function. But it still calls Object.opEquals: Error: function object.opEquals (Object lhs, Object rhs) is not callable using argument types (I,I) This is not Object.opEquals, it's object.opEquals -- note the capitalization. The latter is a standalone function in module object. Error: cannot implicitly convert expression (i) of type deneme.I to object.Object Error: cannot implicitly convert expression (i) of type deneme.I to object.Object This is a bug. i == i should call the opEquals method on i. Instead, the compiler is trying to call the standalone opEquals(Object, Object) function. I confirmed with 2.043 that this is a bug. I'll file with a test case. -Steve
Re: OpEquals and Interfaces
Steven Schveighoffer Wrote: > Any good reason? No. > > But the stated reason is usually that interfaces don't necessarily have to > be Objects, they can be COM objects, which 1) has no bearing in some OSes, > and 2) does anyone use this feature? Which feature? Interfaces? COM? upcasting? or opEquals? I don't use opEquals. :3
Re: OpEquals and Interfaces
Christoph Mueller is asking about the exact problem that I've been having. :)

Steven Schveighoffer wrote:
> If you are using D2, there is a workaround:
>
> interface I
> {
>     final bool opEquals(I other)
>     {
>         Object me = cast(Object)this;
>         Object they = cast(Object)other;
>         return equals(me, they);

Is 'equals' a function on this interface?

>     }
> }

I could not make that work. The compiler is still trying to call Object.opEquals. At the risk of turning this forum into D.learn, I tried to have an explicit 'equals' member function on the interface and use that. This is what I have:

import object;

interface I
{
    final bool opEquals(I other) { return equals(cast(Object)other); }
    bool equals(Object o);
}

class C : I
{
    override bool opEquals(Object o) const { return true; }
    bool equals(Object o) { return this == o; }
}

void main()
{
    I i = new C;
    i == i; // <-- Error line
}

But it still calls Object.opEquals:

Error: function object.opEquals (Object lhs, Object rhs) is not callable using argument types (I,I)
Error: cannot implicitly convert expression (i) of type deneme.I to object.Object
Error: cannot implicitly convert expression (i) of type deneme.I to object.Object

Ali
Re: SListRange, Ranges, costructors
> 1) I've seen that the Node struct inside std.range.SListRange is not a
> static struct, is it a small bug?

Bug 4087. Currently it's not a bug.

> 2) In that List(T) I've used the Range protocol. But I've had to
> add a reset() function too. Isn't it useful for foreach() to call
> a function similar to reset() (if present) at the begin of the
> iteration, to set the variables necessary for the iteration?

I've taken a look at SListRange, and it seems the p pointer is useless; this works, and there is no need for the reset():

import std.stdio;

struct List(T) {
    static struct Node {
        T data;
        Node* next;
        this(T d, Node* p) { data = d; next = p; }
    }

    Node* lh;

    this(T[] arr) {
        foreach_reverse (el; arr)
            lh = new Node(el, lh);
    }

    bool empty() { return !lh; }
    T front() { return lh.data; }
    void popFront() { lh = lh.next; }
}

void main() {
    List!int items = [1, 2, 3];
    foreach (x; items) write(x, " "); // prints 1 2 3
    writeln();
    assert(items.lh);
    foreach (x; items) write(x, " "); // prints 1 2 3
}

But I feel dumb and I don't understand: why isn't lh null at the end of the first foreach?

> 3) The stack initialization of a struct has some built-in sugar,
> while to allocate the struct on the stack you need a this() method:

Enhancement request 4086. Bye, bearophile
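The likely answer to the question above: foreach over a struct that implements the range primitives iterates over a *copy* of the struct, so the original's lh pointer is never advanced. A minimal sketch of the same effect (hypothetical Counter type, just to isolate the behavior):

```d
import std.stdio;

// A trivial struct range counting n down to 1.
struct Counter
{
    int n;
    bool empty() { return n == 0; }
    int front() { return n; }
    void popFront() { --n; }
}

void main()
{
    auto c = Counter(3);
    foreach (x; c)          // foreach consumes a copy of c,
        write(x, " ");      // printing 3 2 1
    writeln();
    assert(c.n == 3);       // ...so the original is unchanged
}
```

The same copy semantics explain why List!int survives two foreach loops without a reset().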
Re: OpEquals and Interfaces
If you are using D2, there is a workaround:

interface I
{
    final bool opEquals(I other)
    {
        Object me = cast(Object)this;
        Object they = cast(Object)other;
        return equals(me, they);
    }
}

But it would be nice if the compiler did this automatically. There are other things that suck because interfaces are not assumed to be derived from Object.

What other kinds of things don't work relating to interfaces and Object?

-- ruunhb [AT] googlemail.com http://www.ruuns.de/blog
Re: OpEquals and Interfaces
On Tue, 13 Apr 2010 15:50:36 -0400, Christoph Mueller wrote: I'm currently writing a library in D2 which uses interfaces intensively, and I've met a problem when overloading the opEquals operator. In some of my implementations I want to compare an object through an interface of another instance. Unfortunately, the opEquals operator takes only Object parameters, and according to the current DMD compiler it's not possible to implicitly cast an interface to an Object. (I got a nice compiler error.) Is there any reason to forbid implicit downcasting from any interface to Object?

Any good reason? No.

But the stated reason is usually that interfaces don't necessarily have to be Objects: they can be COM objects, which 1) has no bearing on some OSes, and 2) does anyone use this feature?

Of course, explicit downcasting with cast(Object) is also possible, but casting every time for the == operator is not really nice.

If you are using D2, there is a workaround:

interface I
{
    final bool opEquals(I other)
    {
        Object me = cast(Object)this;
        Object they = cast(Object)other;
        return equals(me, they);
    }
}

But it would be nice if the compiler did this automatically. There are other things that suck because interfaces are not assumed to be derived from Object. I think the whole COM interface hack needs to be revisited. -Steve
Re: freepascal and fantom links
Erm, I guess Fantom is what was formerly the FAN programming language. bearophile was kind enough to tell us about this language. Impressive work, however. I also like FreePascal; here the OS/CPU support really shines. Bjoern

On 13/04/2010 18:42, sclytrack wrote: WARNING: Do not read if you don't have the time. Some links to FreePascal packages, not really relevant to D. These packages appear to be a bit new in FreePascal. There are two links at the bottom of the page: FPC shared library, FPC packages http://wiki.freepascal.org/Lazarus/FPC_Libraries --- Fantom programming language. They don't like to be too verbose. Link to the examples: http://fantom.org/doc/examples/index.html
OpEquals and Interfaces
I'm currently writing a library in D2 which uses interfaces intensively, and I've met a problem when overloading the opEquals operator. In some of my implementations I want to compare an object through an interface of another instance. Unfortunately, the opEquals operator takes only Object parameters, and according to the current DMD compiler it's not possible to implicitly cast an interface to an Object. (I got a nice compiler error.) Is there any reason to forbid implicit downcasting from any interface to Object? Of course, explicit downcasting with cast(Object) is also possible, but casting every time for the == operator is not really nice. Chris -- ruunhb [AT] googlemail.com http://www.ruuns.de/blog
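A minimal sketch of the problem being described (hypothetical types; the rejected line reflects dmd's behavior as reported in this thread):

```d
interface I {}
class C : I {}

void main()
{
    I a = new C;
    I b = a;

    // assert(a == b);   // rejected: opEquals takes Object parameters,
    //                   // and an interface won't implicitly convert

    // The explicit downcast works, but is noisy at every comparison site:
    assert(cast(Object)a == cast(Object)b);
}
```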
Re: value range propagation for _bitwise_ OR
Steven Schveighoffer wrote: > In my opinion? Yes, slower compilers that make code easier to write are > better. I don't spend lots of time compiling, I spend it writing code. > And I don't need to babysit the compiler, it goes off and does its thing. > > Performance is only important in the end result. I'm not saying I want > my compiler to be slow, but I want it to be accurate and useful more > than I want it to be quick. In this case, there is a quick and fully > accurate solution, so it doesn't matter. But, for instance, if the > compiler could do a full analysis to check if variables escape their > scope, and that makes the compiler 5x slower, then I'd rather have the > compiler verify my work instead of quickly producing memory-corrupting > code. > This is why there are static verification tools. I'm all in favor of these kinds of tools, but I don't want to have to put up with their slowness every time I compile something. Having a quick write-compile-test cycle is very important IMO. Then when I have a rough outline of the program, I can run another tool (or add a command-line option) to enable extra checking (and then it doesn't matter if the tool takes the whole night to run; I get the results the next morning when I get to work). Jerome -- mailto:jeber...@free.fr http://jeberger.free.fr Jabber: jeber...@jabber.fr
Re: [feedback] folding in scintilla
Nick Sabalausky Wrote: > So, it would be like this mockup (top is sun-style, bottom is multi-line): > > http://www.semitwist.com/download/goodFoldingMore.png > Scintilla's folding is line-based: each line is assigned a fold level, and the folder hides lines according to these levels (a kinda simple Python-friendly algorithm); visible lines are displayed in their entirety.
Re: Fatal flaw in D design which is holding back widespread adoption
BCS Wrote: > I'm not even half joking (maybe a third). This is not a solvable problem. > That said, I think just flat out avoiding any official stance on formatting > would be better than what is there. Indeed. I don't think it should be solved, either. I just don't think the style guide should state that tabs are 8 characters and indents are 4 characters. When that's the case, you can't use tabs to indent. You also have issues when moving around with the arrow keys in the indentation areas, and other small issues. If indeed there should be 8-character tabs, then so be it. Just don't mix it up. > But I really have a hard time caring > about the opinions of someone who would drop a language over such an issue. Aww. If you couldn't tell from the humor, it was a joke. And no, I don't use Notepad, either. -Matt
Re: [feedback] folding in scintilla
"maXmo" wrote in message news:hq1k6n$2om...@digitalmars.com... > Nick Sabalausky Wrote: > >> Ahh, I see. No, that's not what I was talking about. This is a mockup of >> the >> way I've been wanting it: >> >> http://www.semitwist.com/download/goodFolding.png >> >> Putting it on the line with the "{" seem ridiculous, ugly and just plain >> sloppy to me. (IMO). >> > 1. How should it work for sun style? > 2. How should it work for multiline function signature? By "sun-style", I assume you mean like this: foo { } right? Well, the current folding rule scintilla uses is something like this: - Folding starts at (but does not include) any "{" in the source. What I have in mind is more like: - Folding starts at (but does not include) the last non-whitespace character just before any "{" in the source. (This would also make it work for a language like Go^H^H Issue 9) So, it would be like this mockup (top is sun-style, bottom is multi-line): http://www.semitwist.com/download/goodFoldingMore.png In any case, even if neither the opening nor closing curly brace is visible, I think the horizontal rule is sufficient in indicating that a block of code is hidden. But, if some people think that's too subtle (or to allow a VS.NET-style "hover to show the folded code in a popup tooltip"), then it could also do what Andrej suggested and place a specially-highlighted "{...}" (actually including the three dots) at the end of the line just before the horizontal rule. To properly handle something like this: void foo() { int x; { auto f = openFile(); scope(exit) closeFile(f); bar(f); } baz(); } I suppose you could modify the rule to something more like: - Folding starts at (but does not include) the last non-whitespace character just before any "{" in the source, as long as that character is a ")", otherwise just fold at (but not including) the whitespace character immediately before the "{" in question. 
or - Folding starts at (but does not include) the last non-whitespace character just before any "{" in the source, unless that character is a semicolon or another "{", in which case just fold at (but not including) the whitespace character immediately before the "{" in question.
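The last variant of the rule could be pinned down roughly as follows. This is illustrative only -- Scintilla's real lexers assign per-line fold levels, as noted elsewhere in the thread, so a character-based helper like this is just a sketch of the proposal, not Scintilla code:

```d
import std.ascii : isWhite;

// Given the index of a '{' in the source, return the index at which
// folding should start (the fold would hide everything from there to
// the matching '}').  Sketch of the proposed rule.
size_t foldStart(string src, size_t bracePos)
{
    size_t i = bracePos;
    while (i > 0 && isWhite(src[i - 1]))
        --i; // back up over whitespace immediately before the '{'

    // Isolated block: preceded by ';', another '{', or nothing at all --
    // fold at the '{' itself rather than swallowing the previous line.
    if (i == 0 || src[i - 1] == ';' || src[i - 1] == '{')
        return bracePos;

    // Otherwise fold just after the last non-whitespace character, so
    // "void foo()" (or a multi-line signature's closing paren) stays visible.
    return i;
}
```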
What do you think about that ?
Hi there, this topic isn't really interesting IMO, but maybe ... I wanted to ask Andrei and Walter (and anyone who wants to give me a response) about these few links: http://blogs.microsoft.co.il/blogs/tamir/archive/2008/04/28/computer-languages-and-facial-hair-take-two.aspx http://www.alenz.org/mirror/khason/why-microsoft-can-blow-off-with-c.html I think it's not really serious, but these examples are pretty true ... So, Andrei, Walter, have you got some beard for now? Do you think that having a beard or not will decide D's future?
Re: [feedback] folding in scintilla
maXmo wrote: > Nick Sabalausky Wrote: > >> Ahh, I see. No, that's not what I was talking about. This is a mockup of the >> way I've been wanting it: >> >> http://www.semitwist.com/download/goodFolding.png >> >> Putting it on the line with the "{" seem ridiculous, ugly and just plain >> sloppy to me. (IMO). >> > 1. How should it work for sun style? > 2. How should it work for multiline function signature? I believe that XEmacs does the right thing here: fold on the line that contains the closing parenthesis of the function signature (or the if/switch/for/while condition), except for isolated blocks, where it folds on the line of the opening brace. Jerome -- mailto:jeber...@free.fr http://jeberger.free.fr Jabber: jeber...@jabber.fr
Re: value range propagation for _bitwise_ OR
On Tue, 13 Apr 2010 13:37:13 -0400, Jérôme M. Berger wrote: Steven Schveighoffer wrote: Jérôme M. Berger Wrote: Steven Schveighoffer wrote: When we're talking about the difference between O(1) and O(lgn), I'll take accuracy over speed in my compiler any day. And when we're talking about the difference between 10s and 55s for a minimal loss of accuracy, which will you take? Especially if the accuracy loss is less than is lost elsewhere (due to holes in the ranges). Really? You rebuilt the compiler with your range propagation algorithm and verified that it adds 10 seconds versus an accurate one that adds 55s? How much time did the compiler spend to compile? I'd hazard to guess that a code base that adds 10s worth of your algorithm takes at least a few hours to compile. Is 55s that bad at that point? Again, if it takes the compiler an extra insignificant amount of time to be more accurate, I'm all for accuracy over speed when you get down to that level of insignificance. I'd say the point of pain has to be at least 10% of the compile time before it makes any sizable difference. My point is that if you always choose an algorithm that is 5 to 6 times slower just because it brings extra precision which you may not really need, then you will wind up with a compiler that is 5 to 6 times slower than it needs to be. Sure the difference on *one* function is not great in absolute terms, but if you make the same choice for *all* functions, then where do you go? In my opinion? Yes, slower compilers that make code easier to write are better. I don't spend lots of time compiling, I spend it writing code. And I don't need to babysit the compiler, it goes off and does its thing. Performance is only important in the end result. I'm not saying I want my compiler to be slow, but I want it to be accurate and useful more than I want it to be quick. In this case, there is a quick and fully accurate solution, so it doesn't matter. 
But, for instance, if the compiler could do a full analysis to check if variables escape their scope, and that makes the compiler 5x slower, then I'd rather have the compiler verify my work instead of quickly producing memory-corrupting code. -Steve
Re: freepascal and fantom links
sclytrack Wrote: > Fantom programming language. They don't like to be too verbose. > Link to the examples. > > http://fantom.org/doc/examples/index.html Interesting. Looks like they did a lot.
Re: value range propagation for _bitwise_ OR
Steven Schveighoffer wrote: > using your highbit function, which is very neat BTW > Thanks, but I can't claim credit for it. It's a classical algorithm which comes up regularly in programming discussions and newsgroups... > uint maxor(uint mina, uint maxa, uint minb, uint maxb) > { > return maxa | maxb | ((1 << highbit(((1 << (max(highbit(mina ^ > maxa), highbit(minb ^ maxb)) + 1)) - 1) & maxa & maxb)) - 1); > } > Neat! Jerome -- mailto:jeber...@free.fr http://jeberger.free.fr Jabber: jeber...@jabber.fr
Re: value range propagation for _bitwise_ OR
Steven Schveighoffer wrote: > Steven Schveighoffer Wrote: > >> I'll have to double check, I thought I copied your code verbatim (I did >> switch around the arguments to be minA, maxA, minB, maxB to fit my test >> harness, and I changed the uint_32_t to uint). I'll check tomorrow at work >> where the computer I used to test is. >> > > I think I see the issue, I was using an older version of your highbit, at > some point you changed it, but didn't repeat it in this solution, so I > searched for it on the newsgroup and found your first version. That differs > from later versions you posted, but I can't find where you identified there > was a bug. > > I'll test again tomorrow, but it looks like that probably accounts for the > problem. > > The version I was using: > http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108788 > > The later version: > http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108855 > > It looks like in that second post that Ali had the same problem I did. > > Next time, you should include your whole code each time, especially parts you > changed. > Oops, I hadn't noticed that I sent two different versions (except for the C<->D translation). Sorry about that. Note that there is no bug: both versions are correct. The first one uses 1-based indexes, while the second uses 0-based indexes and processes 0 differently. Jerome -- mailto:jeber...@free.fr http://jeberger.free.fr Jabber: jeber...@jabber.fr
Re: value range propagation for _bitwise_ OR
Fawzi Mohamed wrote: > > On 13-apr-10, at 12:02, Don wrote: >> It's been part of DMD2 for a while now. It allows you to do things like: >> >> ubyte lowbits(int x) >> { >>return x & 7; >> } >> >> without an explicit cast. The compiler knows that x&7 can safely fit >> inside a single byte. Whereas ((x&7) << 12) | (x&3); >> does not fit, and requires an explicit cast. > > ah ok I understand, I thought that was treated like x & cast(ubyte)7 , > and so as comparison of the compile time value with the ranges of ubyte > (no range propagation needed). > But I can understand why it is treated as cast(ubyte)((cast(int)x) & 7), > as it is probably easier for the compiler, as it upcasts by default. > > Thanks for the explanation. > Note that in Don's example, "x" is an int, not a ubyte, so you don't need the cast to "int" in your second example, but you still need range propagation in the first... Jerome -- mailto:jeber...@free.fr http://jeberger.free.fr Jabber: jeber...@jabber.fr
Re: value range propagation for _bitwise_ OR
Steven Schveighoffer wrote: > Jérôme M. Berger Wrote: > >> Steven Schveighoffer wrote: >>> When we're talking about the difference between O(1) and O(lgn), I'll >>> take accuracy over speed in my compiler any day. >> And when we're talking about the difference between 10s and 55s for >> a minimal loss of accuracy, which will you take? Especially if the >> accuracy loss is less than is lost elsewhere (due to holes in the >> ranges). > > Really? You rebuilt the compiler with your range propagation algorithm and > verified that it adds 10 seconds versus an accurate one that adds 55s? How > much time did the compiler spend to compile? I'd hazard to guess that a code > base that adds 10s worth of your algorithm takes at least a few hours to > compile. Is 55s that bad at that point? > > Again, if it takes the compiler an extra insignificant amount of time to be > more accurate, I'm all for accuracy over speed when you get down to that > level of insignificance. I'd say the point of pain has to be at least 10% of > the compile time before it makes any sizable difference. > My point is that if you always choose an algorithm that is 5 to 6 times slower just because it brings extra precision which you may not really need, then you will wind up with a compiler that is 5 to 6 times slower than it needs to be. Sure the difference on *one* function is not great in absolute terms, but if you make the same choice for *all* functions, then where do you go? Jerome -- mailto:jeber...@free.fr http://jeberger.free.fr Jabber: jeber...@jabber.fr
freepascal and fantom links
WARNING: Do not read if you don't have the time. Some links to freepascal packages, not really relevant to D. These packages appear to be a bit new in freepascal. There are two links at the bottom of the linked page, FPC shared library and FPC packages: http://wiki.freepascal.org/Lazarus/FPC_Libraries --- Fantom programming language. They don't like to be too verbose. Link to the examples: http://fantom.org/doc/examples/index.html
Re: value range propagation for _bitwise_ OR
Don wrote: Adam D. Ruppe wrote: On Tue, Apr 13, 2010 at 11:10:24AM -0400, Clemens wrote: That's strange. Looking at src/backend/cod4.c, function cdbscan, in the dmd sources, bsr seems to be implemented in terms of the bsr opcode [1] (which I guess is the reason it's an intrinsic in the first place). I would have expected this to be much, much faster than a user function. Anyone care enough to check the generated assembly? The opcode is fairly slow anyway (as far as opcodes go) - odds are the implementation inside the processor is similar to Jerome's method, and the main savings come from it loading fewer bytes into the pipeline. I remember a line from a blog, IIRC it was the author of the C++ FQA writing it, saying hardware and software are pretty much the same thing - moving an instruction to hardware doesn't mean it will be any faster, since it is the same algorithm, just done in processor microcode instead of user opcodes. It's fast on Intel, slow on AMD. I bet the speed difference comes from inlining max(). Specifically, bsr is 7 uops on AMD, 1 uop on Intel since the original Pentium. AMD's performance is shameful. And bsr() is supported in the compiler; in fact DMC uses it extensively, which is why it's included in DMD!
Re: value range propagation for _bitwise_ OR
Adam D. Ruppe wrote: On Tue, Apr 13, 2010 at 11:10:24AM -0400, Clemens wrote: That's strange. Looking at src/backend/cod4.c, function cdbscan, in the dmd sources, bsr seems to be implemented in terms of the bsr opcode [1] (which I guess is the reason it's an intrinsic in the first place). I would have expected this to be much, much faster than a user function. Anyone care enough to check the generated assembly? The opcode is fairly slow anyway (as far as opcodes go) - odds are the implementation inside the processor is similar to Jerome's method, and the main savings come from it loading fewer bytes into the pipeline. I remember a line from a blog, IIRC it was the author of the C++ FQA writing it, saying hardware and software are pretty much the same thing - moving an instruction to hardware doesn't mean it will be any faster, since it is the same algorithm, just done in processor microcode instead of user opcodes. It's fast on Intel, slow on AMD. I bet the speed difference comes from inlining max().
Re: value range propagation for _bitwise_ OR
On Tue, Apr 13, 2010 at 11:10:24AM -0400, Clemens wrote: > That's strange. Looking at src/backend/cod4.c, function cdbscan, in the dmd > sources, bsr seems to be implemented in terms of the bsr opcode [1] (which I > guess is the reason it's an intrinsic in the first place). I would have > expected this to be much, much faster than a user function. Anyone care > enough to check the generated assembly? The opcode is fairly slow anyway (as far as opcodes go) - odds are the implementation inside the processor is similar to Jerome's method, and the main savings come from it loading fewer bytes into the pipeline. I remember a line from a blog, IIRC it was the author of the C++ FQA writing it, saying hardware and software are pretty much the same thing - moving an instruction to hardware doesn't mean it will be any faster, since it is the same algorithm, just done in processor microcode instead of user opcodes. -- Adam D. Ruppe http://arsdnet.net
Re: value range propagation for _bitwise_ OR
On Tue, Apr 13, 2010 at 10:52:14AM -0400, Steven Schveighoffer wrote: > Just a note, this code should be runnable in C++, because the compiler is > written in C++. Is bsr available there? I just assumed since it was in D that it was probably in DMC too, but I can't find it in the docs, so maybe not. Well, there's always inline asm :D But 300 milliseconds over 100 million iterations is negligible anyway. -- Adam D. Ruppe http://arsdnet.net
Re: value range propagation for _bitwise_ OR
Adam Ruppe Wrote: > Jerome's highbit function is the same as std.intrinsic.bsr. I wonder > which is faster? > > I ran a test, and for 100 million iterations (1..1000), the > intrinsic beat out his function by a mere 300 milliseconds on my box. > - highbit ran in an average of 1897 ms and bsr did the same in an > average of 1534. > > Recompiling with -inline -O -release cuts the raw numbers about in > half, but keeps about the same difference, leading me to think > overhead accounts for a fair amount of the percentage instead of actual > implementation. The new averages are 1134 and 853. That's strange. Looking at src/backend/cod4.c, function cdbscan, in the dmd sources, bsr seems to be implemented in terms of the bsr opcode [1] (which I guess is the reason it's an intrinsic in the first place). I would have expected this to be much, much faster than a user function. Anyone care enough to check the generated assembly? [1] http://www.itis.mn.it/linux/quarta/x86/bsr.htm -- Clemens
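For readers following along in C++: the bsr opcode discussed above has a portable compiler-builtin equivalent. This is a sketch, not DMD's actual code; it assumes a GCC/Clang-style compiler, where `__builtin_clz` lowers to bsr (or lzcnt) on x86.

```cpp
#include <cassert>
#include <cstdint>

// A hypothetical C++ stand-in for std.intrinsic.bsr: count-leading-zeros
// mirrors bit-scan-reverse. Like the raw bsr opcode, the result is
// undefined for an input of 0.
int bsr(uint32_t x) {
    return 31 - __builtin_clz(x);
}
```

For example, bsr(1) is 0 and bsr(0x80000000u) is 31, matching the 0-based bit index the opcode produces.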
Re: value range propagation for _bitwise_ OR
Steven Schveighoffer: > I never would have thought of doing a binary search for the high bit, it > is definitely a novel idea to me :) You can learn something from this page (it can also be useful for the translation to C++): http://graphics.stanford.edu/~seander/bithacks.html Bye, bearophile
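To give a flavor of the page bearophile links, here is one of its classic entries (the "round up to the next power of 2" smear), transcribed to C++; the OR-ing of shifted copies that floods every bit below the highest set bit is also the building block behind the branch-free highbit variants on the same page.

```cpp
#include <cassert>
#include <cstdint>

// Bit-smearing trick from the Stanford bithacks page: after the ORs,
// every bit below the highest set bit is set, so adding 1 yields the
// next power of 2 (the initial decrement keeps exact powers of 2 fixed).
uint32_t ceil_pow2(uint32_t v) {
    v--;
    v |= v >> 1;  v |= v >> 2;  v |= v >> 4;
    v |= v >> 8;  v |= v >> 16;
    return v + 1;
}
```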
Re: value range propagation for _bitwise_ OR
On Tue, 13 Apr 2010 09:56:34 -0400, Adam Ruppe wrote: Jerome's highbit function is the same as std.intrinsic.bsr. I wonder which is faster? Just a note, this code should be runnable in C++, because the compiler is written in C++. Is bsr available there? Recompiling with -inline -O -release cuts the raw numbers about in half, but keeps about the same difference, leading me to think overhead accounts for a fair amount of the percentage instead of actual implementation. The new averages are 1134 and 853. Probably the inlining accounts for the savings here. Calling a function is rather expensive. I've gotta say, your implementation of the function is better than I would have done though. Without bsr, I probably would have looped, shifted, and tested each individual bit... I never would have thought of doing a binary search for the high bit, it is definitely a novel idea to me :) -Steve
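The binary-search idea Steve describes can be reconstructed as follows (this is an illustration in C++, not necessarily Jérôme's exact posted code): each mask test halves the window that can contain the highest set bit, so 32 bits need only 5 tests instead of up to 32 shift-and-test steps.

```cpp
#include <cassert>
#include <cstdint>

// Binary search for the highest set bit: returns its 0-based index,
// and 0 for an input of 0 (the convention of Jérôme's later version).
int highbit(uint32_t x) {
    int r = 0;
    if (x & 0xFFFF0000u) { r += 16; x >>= 16; }  // top half?
    if (x & 0x0000FF00u) { r += 8;  x >>= 8;  }  // top byte of what remains?
    if (x & 0x000000F0u) { r += 4;  x >>= 4;  }
    if (x & 0x0000000Cu) { r += 2;  x >>= 2;  }
    if (x & 0x00000002u) { r += 1; }
    return r;
}
```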
Re: value range propagation for _bitwise_ OR
Jerome's highbit function is the same as std.intrinsic.bsr. I wonder which is faster? I ran a test, and for 100 million iterations (1..1000), the intrinsic beat out his function by a mere 300 milliseconds on my box. - highbit ran in an average of 1897 ms and bsr did the same in an average of 1534. Recompiling with -inline -O -release cuts the raw numbers about in half, but keeps about the same difference, leading me to think overhead accounts for a fair amount of the percentage instead of actual implementation. The new averages are 1134 and 853. I've gotta say, your implementation of the function is better than I would have done though. Without bsr, I probably would have looped, shifted, and tested each individual bit... On 4/13/10, Steven Schveighoffer wrote: > On Mon, 12 Apr 2010 22:37:00 -0400, Steven Schveighoffer > wrote: > >> Jérôme M. Berger Wrote: >> >>> Steven Schveighoffer wrote: >>> > Fails for test case: >>> > >>> > minA = 4, maxA = 4, minB = 4, maxB = 6 (outputs 7, accurate result is >>> 6). >>> > >>> Nope, outputs 6. Note that I've run an exhaustive search for all >>> combinations in the [0, 63] range, so if there are mistakes they >>> have to be outside that range... >>> >> >> I'll have to double check, I thought I copied your code verbatim (I did >> switch around the arguments to be minA, maxA, minB, maxB to fit my test >> harness, and I changed the uint_32_t to uint). I'll check tomorrow at >> work where the computer I used to test is. > > I confirm, with the updated highbit, your solution works. > > Also, inspired by your solution (and using your highbit function, which is > very neat BTW) I came up with a one-liner. Enjoy :) > > uint maxor(uint mina, uint maxa, uint minb, uint maxb) > { > return maxa | maxb | ((1 << highbit(((1 << (max(highbit(mina ^ maxa), > highbit(minb ^ maxb)) + 1)) - 1) & maxa & maxb)) - 1); > } > > Anyone want to do the equivalent minor? I've spent way too much time on > this :) > > -Steve >
Re: value range propagation for _bitwise_ OR
Steven Schveighoffer wrote: On Mon, 12 Apr 2010 22:37:00 -0400, Steven Schveighoffer wrote: Jérôme M. Berger Wrote: Steven Schveighoffer wrote: > Fails for test case: > > minA = 4, maxA = 4, minB = 4, maxB = 6 (outputs 7, accurate result is 6). > Nope, outputs 6. Note that I've run an exhaustive search for all combinations in the [0, 63] range, so if there are mistakes they have to be outside that range... I'll have to double check, I thought I copied your code verbatim (I did switch around the arguments to be minA, maxA, minB, maxB to fit my test harness, and I changed the uint_32_t to uint). I'll check tomorrow at work where the computer I used to test is. I confirm, with the updated highbit, your solution works. Also, inspired by your solution (and using your highbit function, which is very neat BTW) I came up with a one-liner. Enjoy :) uint maxor(uint mina, uint maxa, uint minb, uint maxb) { return maxa | maxb | ((1 << highbit(((1 << (max(highbit(mina ^ maxa), highbit(minb ^ maxb)) + 1)) - 1) & maxa & maxb)) - 1); } Anyone want to do the equivalent minor? I've spent way too much time on this :) -Steve Awesome stuff, guys. I think that's worthy of inclusion as an exercise in Knuth's Volume 4, fascicle 1. Suggest it to someone at ACCU? (Knuth has highbit(x) but calls it lambda(x). He notes the result that highbit(x)==highbit(y) iff x^y <= x&y).
Re: Automatic opApply iteration counter
On Tue, 13 Apr 2010 06:13:37 -0400, Steven Schveighoffer wrote: Don't forget, opApply is just a function, you can use it as such :) Oh, and don't forget to always make opApply take a scope delegate! Otherwise, it allocates the calling function's frame on the heap (in D2 at least). -Steve
Re: value range propagation for _bitwise_ OR
On Mon, 12 Apr 2010 22:37:00 -0400, Steven Schveighoffer wrote: Jérôme M. Berger Wrote: Steven Schveighoffer wrote: > Fails for test case: > > minA = 4, maxA = 4, minB = 4, maxB = 6 (outputs 7, accurate result is 6). > Nope, outputs 6. Note that I've run an exhaustive search for all combinations in the [0, 63] range, so if there are mistakes they have to be outside that range... I'll have to double check, I thought I copied your code verbatim (I did switch around the arguments to be minA, maxA, minB, maxB to fit my test harness, and I changed the uint_32_t to uint). I'll check tomorrow at work where the computer I used to test is. I confirm, with the updated highbit, your solution works. Also, inspired by your solution (and using your highbit function, which is very neat BTW) I came up with a one-liner. Enjoy :) uint maxor(uint mina, uint maxa, uint minb, uint maxb) { return maxa | maxb | ((1 << highbit(((1 << (max(highbit(mina ^ maxa), highbit(minb ^ maxb)) + 1)) - 1) & maxa & maxb)) - 1); } Anyone want to do the equivalent minor? I've spent way too much time on this :) -Steve
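Since the thread also discusses translating this to C++, here is Steve's one-liner transcribed into compilable C++. It assumes a 0-based highbit with highbit(0) == 0 (which matches the later version Jérôme posted), and pairs it with a brute-force checker so the claim can be verified over small ranges.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// 0-based binary-search highbit; highbit(0) == 0 by convention.
static int highbit(uint32_t x) {
    int r = 0;
    if (x & 0xFFFF0000u) { r += 16; x >>= 16; }
    if (x & 0x0000FF00u) { r += 8;  x >>= 8;  }
    if (x & 0x000000F0u) { r += 4;  x >>= 4;  }
    if (x & 0x0000000Cu) { r += 2;  x >>= 2;  }
    if (x & 0x00000002u) { r += 1; }
    return r;
}

// Steve's one-liner: upper bound of a|b for a in [mina, maxa], b in [minb, maxb].
uint32_t maxor(uint32_t mina, uint32_t maxa, uint32_t minb, uint32_t maxb) {
    return maxa | maxb | ((1u << highbit(((1u << (std::max(highbit(mina ^ maxa),
        highbit(minb ^ maxb)) + 1)) - 1u) & maxa & maxb)) - 1u);
}

// Exhaustive reference implementation for cross-checking small ranges.
uint32_t maxor_brute(uint32_t mina, uint32_t maxa, uint32_t minb, uint32_t maxb) {
    uint32_t best = 0;
    for (uint32_t a = mina; a <= maxa; a++)
        for (uint32_t b = minb; b <= maxb; b++)
            best = std::max(best, a | b);
    return best;
}
```

In particular, the contested test case from the thread, maxor(4, 4, 4, 6), comes out as 6 with this highbit.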
Re: [feedback] folding in scintilla
Nick Sabalausky Wrote: > Ahh, I see. No, that's not what I was talking about. This is a mockup of the > way I've been wanting it: > > http://www.semitwist.com/download/goodFolding.png > > Putting it on the line with the "{" seem ridiculous, ugly and just plain > sloppy to me. (IMO). > 1. How should it work for Sun style? 2. How should it work for a multiline function signature?
Re: value range propagation for _bitwise_ OR
On 13-apr-10, at 12:01, bearophile wrote: Fawzi Mohamed: I guess that I am missing something obvious, so I don't see the reason for range propagation, but maybe I am not the only one, so thanks for an explanation... Range propagation can improve the situation a little, so it's not bad. But it can't replace proper integral overflows. You need integral overflows at runtime. Integral overflows are helpful only if you have automatic conversion to a larger type, but that breaks the compile time knowledge of the size of such an integer, so that you have to assume that it might need to be pushed to the heap. Yes you might use some tricks (like using size_t.sizeof*8-1 bits, or a pointer to spare some place), but I don't think that D wants to go that way for the basic integers... Bye, bearophile
Re: value range propagation for _bitwise_ OR
On 13-apr-10, at 12:02, Don wrote: Fawzi Mohamed wrote: On 12-apr-10, at 21:40, Steven Schveighoffer wrote: On Mon, 12 Apr 2010 13:45:14 -0400, Jérôme M. Berger wrote: Steven Schveighoffer wrote: Jérôme M. Berger Wrote: Jérôme M. Berger wrote: OK, I found a solution that is optimal in over 99% of the cases (99.195% to be precise), while still being fast (because it avoids looping over the bits): I've run my function and Andrei's on all possible min, max pairs in 0..299 without checking, just for the timing. On my computer (Athlon X2 64 @2GHz), my function took 50s and Andrei's took 266s. The difference should be even bigger for larger numbers. Can I ask, what is the point of having a non-exact solution (unless you are just doing this for fun)? We are talking about range propagation, a function of the compiler, not a function of the compiled program. Therefore, slower but more exact functions should be preferred, since they only affect the compile time. Note that you will only need to do this range propagation on lines that "or" two values together, and something that reduces the runtime of the compiler by 216 seconds, but only when compiling enough code to have over 8 billion 'or' operations in it (300^4), I don't think is really that important. Give me the exact solution that prevents me from having to cast when the compiler can prove it, even if the runtime of the compiler is slightly slower. Funny you should say that given the current thread comparing the speed of the D compiler to that of the Go compiler... My point was simply that the amount of time saved is relative to the size of the program being compiled. If we assume conservatively that every other line in a program has bitwise or in it, then you are talking about a 16 billion line program, meaning the 216 seconds you save is insignificant compared to the total compile time. 
My point was that your fast solution that is inaccurate is not preferable because nobody notices the speed, they just notice the inaccuracy. We are talking about range propagation, a function of the compiler, not a function of the compiled program. Since we can't get a 100% accurate representation of the possible values anyway (even yours might leave holes in the middle after all), then it might be better to choose a faster, slightly less precise algorithm if the difference is not too great. That's the difference between a compiler and a full-fledged static code analysis and program prover. When we're talking about the difference between O(1) and O(lgn), I'll take accuracy over speed in my compiler any day. The solution should be 100% accurate for the problem statement. If the problem statement is not what we need, then we need a new problem statement :) Solving the problem statement for 99% of values is not good enough. Anyway, the point is moot, I have a new version of my algorithm with 100% precision and high speed. Yes, I'm still trying to understand how it works :) Sorry for the probably stupid question, but I don't understand much the need of all this range propagation, in the compiler either you have a constant (and then you have the value at compile time, and you don't need any range propagation, you can compare with the value), or you have a runtime value. It's been part of DMD2 for a while now. It allows you to do things like: ubyte lowbits(int x) { return x & 7; } without an explicit cast. The compiler knows that x&7 can safely fit inside a single byte. Whereas ((x&7) << 12) | (x&3); does not fit, and requires an explicit cast. ah ok I understand, I thought that was treated like x & cast(ubyte)7 , and so as comparison of the compile time value with the ranges of ubyte (no range propagation needed). But I can understand why it is treated as cast(ubyte)((cast(int)x) & 7), as it is probably easier for the compiler, as it upcasts by default. 
Thanks for the explanation.
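Don's interval reasoning can be written out numerically (a small illustration; the function names are mine, not from the thread): for any int x, (x & 7) lies in [0, 7] and so fits a ubyte, while ((x & 7) << 12) | (x & 3) can reach (7 << 12) | 3 = 28675, which does not.

```cpp
#include <cassert>

// The two expressions from Don's example, as plain functions so their
// actual runtime ranges can be sampled.
int lowbits(int x)  { return x & 7; }                 // always in [0, 7]
int combined(int x) { return ((x & 7) << 12) | (x & 3); }  // up to 28675
```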
Re: Automatic opApply iteration counter
bearophile Wrote: > I have not written this as enhancement request on Bugzilla because while I > have a clear (small) problem, I think my possible solution is bad (see at the > bottom). > [snip] > To allow both the foreach version with and without index I have to duplicate > the body of the opApply, but this code duplication is bad: No. This is how I do it in dcollections (for optional keys in map containers): > import std.stdio: writeln; > struct Xfibonacci { > long max; > this(long inmax) { max = inmax; } > int opApply(int delegate(ref long) dg) { int _dg(ref int i, ref long l) { return dg(l); } return opApply(&_dg); > } > int opApply(int delegate(ref int, ref long) dg) { > int result, index; > long a=0, b=1; > while (a < max) { > result = dg(index, b); > if (result) break; > index++; > a += b; > result = dg(index, a); > if (result) break; > b += a; > index++; > } > return result; > } > } Don't forget, opApply is just a function, you can use it as such :)
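The _dg forwarding trick Steve shows translates to other languages too. Here is a hypothetical C++ analogue (names and types are mine, not dcollections code): keep one real traversal, the indexed one, and implement the index-less overload by wrapping the caller's callback in an adapter that drops the index, so the loop body is never duplicated.

```cpp
#include <cassert>
#include <functional>
#include <vector>

struct Fibonacci {
    long max;

    // The single real implementation: indexed traversal of Fibonacci
    // numbers below max.
    void eachIndexed(const std::function<void(int, long)>& cb) const {
        long a = 1, b = 1;
        int index = 0;
        while (a < max) { cb(index++, a); long t = a + b; a = b; b = t; }
    }

    // Forwards to eachIndexed, discarding the index (the _dg trick).
    void each(const std::function<void(long)>& cb) const {
        eachIndexed([&](int, long v) { cb(v); });
    }
};
```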
Re: value range propagation for _bitwise_ OR
Fawzi Mohamed: > I guess that I am missing something obvious, so I don't see the reason > for range propagation, but maybe I am not the only one, so thanks for > an explanation... Range propagation can improve the situation a little, so it's not bad. But it can't replace proper integral overflows. You need integral overflows at runtime. Bye, bearophile
Re: value range propagation for _bitwise_ OR
Fawzi Mohamed wrote: On 12-apr-10, at 21:40, Steven Schveighoffer wrote: On Mon, 12 Apr 2010 13:45:14 -0400, Jérôme M. Berger wrote: Steven Schveighoffer wrote: Jérôme M. Berger Wrote: Jérôme M. Berger wrote: OK, I found a solution that is optimal in over 99% of the cases (99.195% to be precise), while still being fast (because it avoids looping over the bits): I've run my function and Andrei's on all possible min, max pairs in 0..299 without checking, just for the timing. On my computer (Athlon X2 64 @2GHz), my function took 50s and Andrei's took 266s. The difference should be even bigger for larger numbers. Can I ask, what is the point of having a non-exact solution (unless you are just doing this for fun)? We are talking about range propagation, a function of the compiler, not a function of the compiled program. Therefore, slower but more exact functions should be preferred, since they only affect the compile time. Note that you will only need to do this range propagation on lines that "or" two values together, and something that reduces the runtime of the compiler by 216 seconds, but only when compiling enough code to have over 8 billion 'or' operations in it (300^4), I don't think is really that important. Give me the exact solution that prevents me from having to cast when the compiler can prove it, even if the runtime of the compiler is slightly slower. Funny you should say that given the current thread comparing the speed of the D compiler to that of the Go compiler... My point was simply that the amount of time saved is relative to the size of the program being compiled. If we assume conservatively that every other line in a program has bitwise or in it, then you are talking about a 16 billion line program, meaning the 216 seconds you save is insignificant compared to the total compile time. My point was that your fast solution that is inaccurate is not preferable because nobody notices the speed, they just notice the inaccuracy. 
We are talking about range propagation, a function of the compiler, not a function of the compiled program. Since we can't get a 100% accurate representation of the possible values anyway (even yours might leave holes in the middle after all), then it might be better to choose a faster, slightly less precise algorithm if the difference is not too great. That's the difference between a compiler and a full-fledged static code analysis and program prover. When we're talking about the difference between O(1) and O(lgn), I'll take accuracy over speed in my compiler any day. The solution should be 100% accurate for the problem statement. If the problem statement is not what we need, then we need a new problem statement :) Solving the problem statement for 99% of values is not good enough. Anyway, the point is moot, I have a new version of my algorithm with 100% precision and high speed. Yes, I'm still trying to understand how it works :) Sorry for the probably stupid question, but I don't understand much the need of all this range propagation, in the compiler either you have a constant (and then you have the value at compile time, and you don't need any range propagation, you can compare with the value), or you have a runtime value. It's been part of DMD2 for a while now. It allows you to do things like: ubyte lowbits(int x) { return x & 7; } without an explicit cast. The compiler knows that x&7 can safely fit inside a single byte. Whereas ((x&7) << 12) | (x&3); does not fit, and requires an explicit cast.
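For reference, the exact bound being argued over in this thread has a classic O(bits) solution. The sketch below is Warren's maxOR from Hacker's Delight, not any of the algorithms posted in this thread: scanning from the top bit down, wherever both upper bounds share a set bit, try clearing it in one of them and setting all lower bits, as long as that stays at or above the matching lower bound.

```cpp
#include <cassert>
#include <cstdint>

// Exact upper bound of x | y for x in [a, b], y in [c, d]
// (Warren, Hacker's Delight).
uint32_t maxOR(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
    uint32_t m = 0x80000000u;
    while (m != 0) {
        if (b & d & m) {
            uint32_t t = (b - m) | (m - 1);  // clear bit in b, set all below
            if (t >= a) { b = t; break; }
            t = (d - m) | (m - 1);           // else try the same in d
            if (t >= c) { d = t; break; }
        }
        m >>= 1;
    }
    return b | d;
}
```

On the thread's contested test case, maxOR(4, 4, 4, 6) yields 6.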
Re: value range propagation for _bitwise_ OR
On 12-apr-10, at 21:40, Steven Schveighoffer wrote: On Mon, 12 Apr 2010 13:45:14 -0400, Jérôme M. Berger wrote: Steven Schveighoffer wrote: Jérôme M. Berger Wrote: Jérôme M. Berger wrote: OK, I found a solution that is optimal in over 99% of the cases (99.195% to be precise), while still being fast (because it avoids looping over the bits): I've run my function and Andrei's on all possible min, max pairs in 0..299 without checking, just for the timing. On my computer (Athlon X2 64 @2GHz), my function took 50s and Andrei's took 266s. The difference should be even bigger for larger numbers. Can I ask, what is the point of having a non-exact solution (unless you are just doing this for fun)? We are talking about range propagation, a function of the compiler, not a function of the compiled program. Therefore, slower but more exact functions should be preferred, since they only affect the compile time. Note that you will only need to do this range propagation on lines that "or" two values together, and something that reduces the runtime of the compiler by 216 seconds, but only when compiling enough code to have over 8 billion 'or' operations in it (300^4), I don't think is really that important. Give me the exact solution that prevents me from having to cast when the compiler can prove it, even if the runtime of the compiler is slightly slower. Funny you should say that given the current thread comparing the speed of the D compiler to that of the Go compiler... My point was simply that the amount of time saved is relative to the size of the program being compiled. If we assume conservatively that every other line in a program has bitwise or in it, then you are talking about a 16 billion line program, meaning the 216 seconds you save is insignificant compared to the total compile time. My point was that your fast solution that is inaccurate is not preferable because nobody notices the speed, they just notice the inaccuracy. 
We are talking about range propagation, a function of the compiler, not a function of the compiled program. Since we can't get a 100% accurate representation of the possible values anyway (even yours might leave holes in the middle after all), then it might be better to choose a faster, slightly less precise algorithm if the difference is not too great. That's the difference between a compiler and a full-fledged static code analysis and program prover. When we're talking about the difference between O(1) and O(lgn), I'll take accuracy over speed in my compiler any day. The solution should be 100% accurate for the problem statement. If the problem statement is not what we need, then we need a new problem statement :) Solving the problem statement for 99% of values is not good enough. Anyway, the point is moot, I have a new version of my algorithm with 100% precision and high speed. Yes, I'm still trying to understand how it works :) Sorry for the probably stupid question, but I don't understand much the need of all this range propagation, in the compiler either you have a constant (and then you have the value at compile time, and you don't need any range propagation, you can compare with the value), or you have a runtime value. Do you want to explicitly add in the compiler the support for more limited runtime values? Otherwise the range of a runtime value is a priori the whole possible range, and thus any rule based on range propagation might be expressed as a static type-based rule (as done in C). You can gain something: for example, you can know that summing 4 shorts you will never overflow an int, is this where you want to go? What kind of bugs are you trying to avoid? Or is it simply having the max and min properties defined? 
Nobody commented on my proposal, but there I see the potential for bugs: (1u-2)+1UL is 4294967296 and this happens also at runtime if you have, for example, a function returning size_t and another returning uint, and you combine them without thinking much and then you switch to 64 bits... There I see a problem, but this one you can see without any special range propagation, just thinking that subtraction or negation of unsigned types is modulo 2^bit size, and thus cannot then be changed to another size without an explicit cast. Maybe in some cases using enums range propagation might spare a cast, but is it really an improvement? I guess that I am missing something obvious, so I don't see the reason for range propagation, but maybe I am not the only one, so thanks for an explanation... Fawzi
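Fawzi's (1u-2)+1UL example can be written out in C++, where the same promotion rules apply (this assumes an LP64 platform, e.g. Linux x86-64, where unsigned long is 64 bits; on LLP64 systems the second result differs):

```cpp
#include <cassert>

// 1u - 2 wraps to 4294967295 in 32-bit unsigned arithmetic; the
// subsequent + 1UL then promotes the already-wrapped value to 64 bits
// instead of cancelling back to 0.
unsigned int wrapped()   { return 1u - 2; }           // 4294967295
unsigned long surprise() { return (1u - 2) + 1UL; }   // 4294967296 on LP64
```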
Vectorization and more
Some musings about vectorization. The benchmarks of the Shootout site sometimes are useful, because they are well chosen to cover a wide variety of computations and kinds of common kernels. One of those benchmarks is the n-body; this is one of the faster implementations, in C: http://shootout.alioth.debian.org/u32/program.php?test=nbody&lang=gcc&id=5 Profiling of the n-body benchmark shows that most of the time is spent in the two nested loops of the advance() function (inside it there is a sqrt that is critical). This is a summary of the code of the C implementation, the increased performance compared to the other versions comes from packing x and y in an SSE register, and z plus an empty value in another, this makes it possible to do three operations (on x, y and z or the speeds) in two instructions instead of three: typedef double v2df __attribute__ (( vector_size(2*sizeof(double)) )); struct planet { v2df xy; v2df z0; /* z and zero */ v2df vxvy; v2df vz00; /* vz and zero */ v2df massmass; /* the mass in both components */ }; static void advance(int q) { ... for (j = i + 1; j < NBODIES; j++) { b2 = &(bodies[j]); dxdy = b->xy - b2->xy; dz00 = b->z0 - b2->z0; /* dx*dx+dy*dy | dz*dz */ tsquare = __builtin_ia32_haddpd(dxdy*dxdy,dz00*dz00); /* dx*dx+dy*dy+dz*dz | dx*dx+dy*dy+dz*dz */ distance2 = __builtin_ia32_haddpd(tsquare,tsquare); magmag = dtdt / (__builtin_ia32_sqrtpd(distance2)*distance2); dxdy *= magmag; dz00 *= magmag; b->vxvy -= dxdy * b2->massmass; b->vz00 -= dz00 * b2->massmass; b2->vxvy += dxdy * b->massmass; b2->vz00 += dz00 * b->massmass; } } ... } Note how the C code performs two sqrt in parallel, even if only one sqrt is necessary, because this helps the compiler keep the values in the SSE registers, avoiding a waste of time (but the CPU will perform two sqrt, so it will waste some electricity. This is not the kind of optimization you want for a netbook that has to save as much power as possible. 
Contrary to what Chris Lattner thinks, I believe a compilation switch like -Oe can exist to compile a program with the purpose of saving energy at runtime). Note2: in practice I have seen that LDC with D code is able to produce a binary about as fast as this C one, using no explicit use of SSE registers (and no array ops) :-) Later I have created a D version of the n-body that uses arrays ops (and as expected it's a little slower than the naive version, because array ops are not refined yet in D). This is a summary of the code: alias double[3] Tri; struct Body { Tri xyz = [0,0,0]; Tri v = [0,0,0]; double mass = 0.0; ... } void advance(double dt) { foreach(idx, ref i; bodies) { double im = i.mass; foreach (ref j; bodies[idx + 1 .. length]) { double jm = j.mass; Tri d = void; d[] = i.xyz[] - j.xyz[]; double distance = sqrt(d[0]*d[0] + d[1]*d[1] + d[2]*d[2]); double mag = dt / (distance * distance * distance); i.v[] -= d[] * jm * mag; j.v[] += d[] * im * mag; } } foreach (ref i; bodies) i.xyz[] += dt * i.v[]; } Future D implementations will surely be able to use an SSE register to store a double[2] (and double[4] in 256 bit CPU registers, etc). But to me the problem here seems a little different: given that Tri array, present two times inside the Body struct, will D compilers be able to use two SSE registers (with a padding item), to implement an operation like this, done among arrays of length 3? d[] = i.xyz[] - j.xyz[]; I think LLVM devs eventually will add vector ops to LLVM, so LLVM will have to face both: 1) the need to pad registers when the arrays aren't as long as an integral number of SSE registers; 2) to copy scalar values in all slots of a SSE register to keep its operations as fast as possible when mixed with vector ops. (Later I might ask to Lattner about native vector ops in LLVM. This can be really useful for LDC). Bye, bearophile
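The xy-pair packing bearophile describes can be sketched with intrinsics as well. This is a minimal illustration (function name and layout are mine, not from the benchmark source) restricted to baseline SSE2, so dx*dx + dy*dy is combined with an unpack plus add rather than SSE3's haddpd used in the benchmark:

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cassert>
#include <cmath>

// Squared 2-D distance between bodies a and b, with (y, x) packed into
// one SSE register each so the subtraction and squaring each take one
// instruction for both coordinates.
double dist2_xy(double ax, double ay, double bx, double by) {
    __m128d xy_a = _mm_set_pd(ay, ax);        // pack (y, x) of body a
    __m128d xy_b = _mm_set_pd(by, bx);        // pack (y, x) of body b
    __m128d d    = _mm_sub_pd(xy_a, xy_b);    // (dy, dx) in one subtraction
    __m128d d2   = _mm_mul_pd(d, d);          // (dy*dy, dx*dx)
    __m128d hi   = _mm_unpackhi_pd(d2, d2);   // bring the high lane down
    return _mm_cvtsd_f64(_mm_add_sd(d2, hi)); // dx*dx + dy*dy
}
```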
Automatic opApply iteration counter
I have not written this as an enhancement request on Bugzilla because, while I have a clear (small) problem, I think my possible solution is bad (see at the bottom).

This is how I can use opApply() to create a lazy generator of the first Fibonacci numbers; inside opApply there can be a good amount of code:

import std.stdio: writeln;

struct Xfibonacci {
    long max;
    this(long inmax) { max = inmax; }

    int opApply(int delegate(ref long) dg) {
        int result;
        long a = 0, b = 1;
        while (a < max) {
            result = dg(b);
            if (result) break;
            a += b;
            result = dg(a);
            if (result) break;
            b += a;
        }
        return result;
    }
}

void main() {
    foreach (f; Xfibonacci(200))
        writeln(f);
}

But in many situations it's good to have an iteration counter too, to index the given item:

foreach (i, f; Xfibonacci(2_000))

To allow both the foreach version with and without an index I have to duplicate the body of opApply, but this code duplication is bad:

struct Xfibonacci {
    long max;
    this(long inmax) { max = inmax; }

    int opApply(int delegate(ref long) dg) {
        // ... as above ...
    }

    int opApply(int delegate(ref int, ref long) dg) {
        int result, index;
        long a = 0, b = 1;
        while (a < max) {
            result = dg(index, b);
            if (result) break;
            index++;
            a += b;
            result = dg(index, a);
            if (result) break;
            b += a;
            index++;
        }
        return result;
    }
}

void main() {
    foreach (i, f; Xfibonacci(2_000))
        writeln(i, " ", f);
}

To avoid the code duplication I can define a generic Enumerate (with its helper function 'enumerate'), very similar to the enumerate() present in Python for the same purpose:

import std.stdio: writeln;
import std.traits: ReturnType;

// Xfibonacci as in the first version, with only the one-argument opApply

struct Enumerate(Iter) {
    Iter iterable;
    int index;
    alias ReturnType!(typeof({ foreach (x; iterable) return x; assert(0); })) ItemType;

    int opApply(int delegate(ref int, ref ItemType) dg) {
        int result;
        foreach (el; iterable) {
            result = dg(index, el);
            if (result) break;
            index++;
        }
        return result;
    }
}

Enumerate!Iter enumerate(Iter)(Iter iterable, int start=0) {
    return Enumerate!Iter(iterable, start);
}

void main() {
    foreach (i, f; enumerate(Xfibonacci(2_000)))
        writeln(i, " ", f);
}

This is not too bad, but the Fibonacci numbers now come out of two opApply calls, so the compiler fails to optimize the code well. So I propose the automatic generation of the indexed version:

// Xfibonacci again with only the one-argument opApply

void main() {
    foreach (f; Xfibonacci(200)) // OK
        writeln(f);
    foreach (i, f; Xfibonacci(200)) // OK, index generated automatically
        writeln(f);
}

This can cause problems if Xfibonacci already contains an opApply that yields two or more items... To me it seems to lead to a messy situation... If you have alternative solutions please tell me.

Bye,
bearophile
Struct template type inference
Inside std.range there is a template struct that implements the chain:

struct ChainImpl(R...) {

plus a tiny helper function 'chain' that's usually the one used in user code:

Chain!(R) chain(R...)(R input) {
    static if (input.length > 1)
        return Chain!(R)(input);
    else
        return input[0];
}

Similar pairs of a struct template plus a little helper function are quite common in std.range, in my libraries, and probably in lots of generic D code that uses struct templates. The purpose of the little function is to instantiate the struct template with the correct type. This is a simplified example:

struct Foo(T) { T x; }

auto foo(T)(T x) { return Foo!T(x); }

void main() {
    auto f = foo(10);
}

In this code, if the helper foo() function doesn't exist and the data type T is a bit complicated, the programmer might need to write something like this (here T is not complicated):

struct Foo(T) { T x; }

void main() {
    auto somevalue = 10;
    auto f = Foo!(typeof(somevalue))(somevalue);
}

which is not handy. So to keep the code simple and to avoid the definition of those helper functions, struct templates can be enhanced so that they infer their template type in some way. But the situation is not always as simple as in that Foo struct template: a struct template can define constructors and initializers, and even that chain() helper function contains a bit of compile-time logic to simplify the case with just one iterable. So I don't know if there is some clean way to infer the types for the struct template.

Bye,
bearophile
Re: Benchmarking in D
On 2010-04-12 11:09, "Jérôme M. Berger" wrote:
> Chris Mueller wrote:
>>> On what OS? On linux, you can do:
>>>   time foo
>>> to get the run time for program foo, including elapsed clock time, time
>>> spent in the program itself and time spent in the kernel on behalf of the
>>> program (for I/O, mallocs, etc);
>>>   cat /proc/$(pidof foo)/status
>>> to get memory information for a running program. I don't know any way to
>>> get the memory information once the program exits.
>> I'm currently using XP, is there any similar way to measure memory
>> consumption? Otherwise I'll install a quick unix development environment
>> on a separate partition.
> I believe that the task manager can give that information. Hit
> Ctrl-Alt-Del, then click on the "Task Manager" button. One of the tabs
> gives information about the running processes. You might need to configure
> the displayed columns, but I don't remember how it's done and I don't have
> an XP box on hand to check.
> Jerome

On Vista: View -> Select Columns and make sure Private Working Set is checked. On XP, checking Memory Usage will get you most of the way there, although it's not quite as good (it lumps in any working set shared with other processes too).

--
~ My software never has bugs. It just develops random features. ~
http://tagzilla.mozdev.org v0.066
Re: [feedback] folding in scintilla
"Nick Sabalausky" wrote in message news:hq142v$1hh...@digitalmars.com... > "maXmo" wrote in message > news:hq0qrq$13p...@digitalmars.com... >> I had exactly same reason: old folding works only for sun style, my >> folding treats sun and ms style equally. >> See: http://i42.tinypic.com/2qjx1mf.png > > Ahh, I see. No, that's not what I was talking about. This is a mockup of > the way I've been wanting it: > > http://www.semitwist.com/download/goodFolding.png > > Putting it on the line with the "{" seem ridiculous, ugly and just plain > sloppy to me. (IMO). > Although, I do think your approach is still an improvement over the way it currently is.