LDC: Constant Folding Across Nested Functions?
Background: This came from an attempt to get rid of delegate indirection in parallel foreach loops on LDC. LDC can inline delegates that always point to the same code. This means that it can inline opApply delegates after inlining opApply itself, effectively constant folding the delegate. Here's a simplified case without unnecessarily complex context:

// Assume this function does NOT get inlined.
// In my real use case it's doing something
// much more complicated and in fact does not
// get inlined.
void runDelegate(scope void delegate() dg) {
    dg();
}

// Assume this function gets inlined into main().
uint divSum(uint divisor) {
    uint result = 0;

    // If divisor gets const folded and is a power of 2 then
    // the compiler can optimize the division to a shift.
    void doComputation() {
        foreach(i; 0U..1_000_000U) {
            result += i / divisor;
        }
    }

    runDelegate(&doComputation);
    return result;
}

void main() {
    // divSum gets inlined to here, but doComputation()
    // can't be, because it's called through a delegate.
    // Therefore, the 2 is never const folded into
    // doComputation().
    auto ans = divSum(2);
}

The issue I'm dealing with in std.parallelism is conceptually the same as this, but with much more context that's irrelevant to this discussion. Would the following be a feasible compiler optimization, either in the near future or at least in principle: when an outer function is inlined, all non-static inner functions should be recompiled with the information gained by inlining the outer function. In this case doComputation() would be recompiled with divisor const-folded to 2 and the division optimized to a shift. This post-inlining compilation would then be passed to runDelegate(). Also, is there any trick I'm not aware of to work around the standard compilation model and force this behavior now?
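One possible workaround today, sketched below under the assumption that the divisor is known at compile time (which isn't always the case in the std.parallelism scenario): hoist the divisor into a template value parameter, so the inner function sees a constant without relying on cross-function inlining at all. The smaller loop bound is just to keep the example quick.

```d
import std.stdio : writeln;

// Hypothetical workaround sketch: divisor is a compile-time template
// parameter, so doComputation() sees a constant and a power-of-2
// division can be strength-reduced to a shift, no post-inlining
// recompilation needed.
uint divSum(uint divisor)() {
    uint result = 0;
    void doComputation() {
        foreach (i; 0U .. 1_000U) {
            result += i / divisor; // divisor is a constant here
        }
    }
    doComputation();
    return result;
}

void main() {
    auto ans = divSum!2(); // one instantiation per divisor value
    writeln(ans);          // 249500
}
```

The trade-off is one template instantiation per distinct divisor, and it only helps when the caller knows the value statically.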
Low-Lock Singletons In D
On the advice of Walter and Andrei, I've written a blog article about the low-lock Singleton pattern in D. This is a previously obscure pattern that uses thread-local storage to make Singletons both thread-safe and efficient. It was independently invented by at least two people: me and Alexander Terekhov, an IBM researcher. However, D's first-class treatment of thread-local storage means the time has come to move it out of obscurity and possibly make it the standard way to do Singletons. Article: http://davesdprogramming.wordpress.com/2013/05/06/low-lock-singletons/ Reddit: http://www.reddit.com/r/programming/comments/1droaa/lowlock_singletons_in_d_the_singleton_pattern/
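For readers who don't follow the link, the core of the pattern as described in the article can be sketched like this (a sketch only; see the article for the memory-model subtleties it glosses over):

```d
class Singleton {
    // Thread-local flag: each thread checks its own copy, so the fast
    // path needs no synchronization once this thread has seen the
    // instance. In D, plain statics are thread-local by default.
    private static bool instantiated_;

    // The one instance, explicitly shared across threads.
    private __gshared Singleton instance_;

    static Singleton get() {
        if (!instantiated_) {              // lock-free fast path
            synchronized (Singleton.classinfo) {
                if (instance_ is null)
                    instance_ = new Singleton;
                instantiated_ = true;      // flip only after construction
            }
        }
        return instance_;
    }
}
```

The lock is taken at most a handful of times per thread; after that, every call goes through the unsynchronized thread-local check.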
Re: From C++14 and Java 1.8
On Sunday, 21 April 2013 at 12:08:54 UTC, bearophile wrote: Arrays#parallelSort uses the Fork/Join framework introduced in Java 7 to assign the sorting tasks to multiple threads available in the thread pool. This is called eating your own dog food. Fork/Join implements a work-stealing algorithm wherein an idle thread can steal tasks queued up in another thread. An overview of Arrays#parallelSort: the method uses a threshold value, and any array of size smaller than the threshold is sorted using the Arrays#sort() API (i.e. sequential sorting). The threshold is calculated considering the parallelism of the machine and the size of the array:

private static final int getSplitThreshold(int n) {
    int p = ForkJoinPool.getCommonPoolParallelism();
    int t = (p > 1) ? (1 + n / (p << 3)) : n;
    return t < MIN_ARRAY_SORT_GRAN ? MIN_ARRAY_SORT_GRAN : t;
}

Once it's decided whether to sort the array in parallel or in serial, it must then decide how to divide the array into multiple parts, assign each part to a Fork/Join task which will take care of sorting it, and then use another Fork/Join task which will take care of merging the sorted arrays. The implementation in JDK 8 uses this approach:
- Divide the array into 4 parts.
- Sort the first two parts and then merge them.
- Sort the next two parts and then merge them.
The above steps are repeated recursively with each part until the size of the part to sort is smaller than the threshold value calculated above. I think it's worth adding something similar as a strategy of std.algorithm.sort.

FWIW, I created a parallel sort in D a while back using std.parallelism. It was part of std.parallel_algorithm, a half-finished project that I abandoned because I was disappointed at how poorly most of it was scaling in practice, probably due to memory bandwidth. If you have some expensive-to-compare types, though, it may be worthwhile. https://github.com/dsimcha/parallel_algorithm/blob/master/parallel_algorithm.d
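For reference, a divide-and-merge parallel sort in this spirit can be sketched with std.parallelism (a toy sketch, not the JDK algorithm and not the parallel_algorithm code; threshold and buffer handling are simplified):

```d
import std.algorithm : copy, merge, sort;
import std.parallelism : task, taskPool;

// Toy parallel merge sort: below a threshold, fall back to sequential
// sort; otherwise sort the two halves in parallel, then merge them
// through a scratch buffer.
void parallelSort(T)(T[] a, size_t threshold = 2048) {
    if (a.length <= threshold) {
        sort(a);
        return;
    }
    immutable mid = a.length / 2;
    auto left = task!(parallelSort!T)(a[0 .. mid], threshold);
    taskPool.put(left);                   // left half runs on the pool
    parallelSort(a[mid .. $], threshold); // right half runs here
    left.yieldForce();                    // wait for the left half

    // Stable two-way merge of the sorted halves via a scratch buffer.
    auto buf = new T[](a.length);
    merge(a[0 .. mid], a[mid .. $]).copy(buf);
    buf.copy(a);
}
```

As the post notes, for cheap comparisons this tends to be memory-bandwidth-bound; the parallel win shows up mainly with expensive comparison functions.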
Re: From C++14 and Java 1.8
On Sunday, 21 April 2013 at 13:30:32 UTC, bearophile wrote: dsimcha: I abandoned because I was disappointed at how poorly most of it was scaling in practice, probably due to memory bandwidth. Then do you know why the Java version seems to be advantageous (with four cores)? Bye, bearophile

I don't know Java very well, but possibilities include:

1. Sorting using a virtual or otherwise non-inlined comparison function. This makes the sorting require much more CPU time but not a lot more memory bandwidth. It does beg the question, though, of why the comparison function isn't inlined, especially since modern JITs can sometimes inline virtual functions.

2. Different hardware than I tested on, maybe with better memory bandwidth.

3. Expensive comparison functions. I didn't test this in D either, because I couldn't think of a good use case. I tested the D parallel sort using small primitive types (ints and floats and stuff).
Command Line Order + Linker Errors
I'm running into some inexplicable linker errors when trying to compile a project. I've tried two command lines to compile the project that I thought were equivalent except for the names of the output files:

// emptymain.d:
void main() {}

// test.d:
unittest {
    double[double] weights = [1: 1.2, 4: 2.3];
    import std.stdio;
    writeln("PASSED");
}

dmd -unittest emptymain.d test.d    // Linker errors
dmd -unittest test.d emptymain.d    // Works

Additionally, the linker errors only occur under a custom version of druntime. Don't try to reproduce them under the stock version. (For the curious, it's the precise heap scanning fork from https://github.com/rainers/druntime/tree/precise_gc2 . I'm trying to get precise heap scanning ready for prime time.) My real question, though, is why should the order of these files on the command line matter, and does this suggest a compiler or linker bug?
Re: Command Line Order + Linker Errors
The messages are below. The exact messages are probably not useful, but I included them since you asked. I meant to specify, though, that they're all undefined reference messages. Actually, none of these issues occur at all when compilation of the two files is done separately, regardless of what order the object files are passed to DMD for linking:

dmd -c -unittest test.d
dmd -c -unittest emptymain.d
dmd -unittest test.o emptymain.o   # Works
dmd -unittest emptymain.o test.o   # Works

emptymain.o:(.data._D68TypeInfo_S6object26__T16AssociativeArrayTdTdZ16AssociativeArray4Slot6__initZ+0x80): undefined reference to `_D11gctemplates77__T11RTInfoImpl2TS6object26__T16AssociativeArrayTdTdZ16AssociativeArray4SlotZ11RTInfoImpl2yG2m'
emptymain.o:(.data._D73TypeInfo_S6object26__T16AssociativeArrayTdTdZ16AssociativeArray9Hashtable6__initZ+0x80): undefined reference to `_D11gctemplates82__T11RTInfoImpl2TS6object26__T16AssociativeArrayTdTdZ16AssociativeArray9HashtableZ11RTInfoImpl2yG2m'
emptymain.o:(.data._D69TypeInfo_S6object26__T16AssociativeArrayTdTdZ16AssociativeArray5Range6__initZ+0x80): undefined reference to `_D11gctemplates78__T11RTInfoImpl2TS6object26__T16AssociativeArrayTdTdZ16AssociativeArray5RangeZ11RTInfoImpl2yG2m'
emptymain.o:(.data._D149TypeInfo_S6object26__T16AssociativeArrayTdTdZ16AssociativeArray5byKeyMFNdZS6object26__T16AssociativeArrayTdTdZ16AssociativeArray5byKeyM6Result6Result6__initZ+0x80): undefined reference to `_D11gctemplates86__T11RTInfoImpl2TS6object26__T16AssociativeArrayTdTdZ16AssociativeArray5byKeyM6ResultZ11RTInfoImpl2yG2m'
emptymain.o:(.data._D153TypeInfo_S6object26__T16AssociativeArrayTdTdZ16AssociativeArray7byValueMFNdZS6object26__T16AssociativeArrayTdTdZ16AssociativeArray7byValueM6Result6Result6__initZ+0x80): undefined reference to `_D11gctemplates88__T11RTInfoImpl2TS6object26__T16AssociativeArrayTdTdZ16AssociativeArray7byValueM6ResultZ11RTInfoImpl2yG2m'
emptymain.o: In function 
`_D11gctemplates66__T6bitmapTS6object26__T16AssociativeArrayTdTdZ16AssociativeArrayZ6bitmapFZG2m': test.d:(.text._D11gctemplates66__T6bitmapTS6object26__T16AssociativeArrayTdTdZ16AssociativeArrayZ6bitmapFZG2m+0x1b): undefined reference to `_D11gctemplates71__T10bitmapImplTS6object26__T16AssociativeArrayTdTdZ16AssociativeArrayZ10bitmapImplFPmZv' On Monday, 29 October 2012 at 21:08:52 UTC, David Nadlinger wrote: On Monday, 29 October 2012 at 20:56:02 UTC, dsimcha wrote: My real question, though, is why should the order of these files on the command line matter and does this suggest a compiler or linker bug? What exactly are the errors you are getting? My first guess would be templates (maybe the precise GC RTInfo ones?) – determining which template instances to emit into what object files is non-trivial, and DMD is currently known to contain a few related bugs. The fact that the problem also appears when compiling all source files at once is somewhat special, though. David
Re: RFC: Pinning interface for the GC
We already have a NO_MOVE attribute that can be set or unset. What's wrong with that? http://dlang.org/phobos/core_memory.html#NO_MOVE

On Saturday, 13 October 2012 at 18:58:27 UTC, Alex Rønne Petersen wrote: Hi, With precise garbage collection coming up, and most likely compacting garbage collection in the future, I think it's time we start thinking about an API to pin garbage collector-managed objects. A typical approach that people use to 'pin' objects today is to allocate a chunk of memory from the C heap, add it as a root [range], and store a reference in it. That, or just global variables. This is kind of terrible because adding the chunk of memory as a root forces the GC to actually scan it, which is unnecessary when what you really want is to pin the object in place and tell the GC "I know what I'm doing, don't touch this." I propose the following functions in core.memory.GC:

static bool pin(const(void)* p) nothrow;
static bool unpin(const(void)* p) nothrow;

The pin function shall pin the object pointed to by p in place such that it is not allowed to be moved nor collected until unpinned. The function shall return true if the object was successfully pinned, or false if the object was already pinned or didn't belong to the garbage collector in the first place. The unpin function shall unpin the object pointed to by p such that it is once again eligible for moving and collection as usual. The function shall return true if the object was successfully unpinned, or false if the object was not pinned or didn't belong to the garbage collector in the first place. Destroy!
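The existing API mentioned at the top of this reply looks like the following (a minimal usage sketch; note one relevant difference from the proposed pin/unpin: NO_MOVE only forbids moving, it does not prevent collection, so something must still keep the block reachable):

```d
import core.memory : GC;

void main() {
    auto buf = new int[](100);

    // Forbid a (future, moving) collector from relocating this block.
    // `buf` itself keeps the block reachable, so it won't be collected.
    GC.setAttr(buf.ptr, GC.BlkAttr.NO_MOVE);

    // ... hand buf.ptr to code that needs a stable address ...

    GC.clrAttr(buf.ptr, GC.BlkAttr.NO_MOVE);
}
```

So NO_MOVE plus an ordinary live reference covers the moving half of pinning; the pin/unpin proposal additionally guards against collection without the scanning cost of addRoot.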
Re: GC statistics
On Wednesday, 10 October 2012 at 19:35:33 UTC, Andrei Alexandrescu wrote: This is mostly for GC experts out there - what statistics are needed and useful, yet not too expensive to collect? https://github.com/D-Programming-Language/druntime/pull/236 Andrei

I'd like to see mark, sweep and page-freeing time counted separately, so that if overall GC performance is slow, the user can identify where the bottleneck is. For example, mark time will be slow if there's a lot of total memory to be scanned. Sweep time will be slow if there are a lot of blocks allocated, even if they're all small. I'm not sure if this is feasible, though, because it assumes that the GC implementation is mark-sweep. I guess we could name the subcategories something more generic, like "mark" and "process marks".
Re: openMP
Ok, I think I see where you're coming from here. I've replied to some points below just to make sure, and to discuss possible solutions.

On Thursday, 4 October 2012 at 16:07:35 UTC, David Nadlinger wrote: On Wednesday, 3 October 2012 at 23:02:25 UTC, dsimcha wrote: Because you already have a system in place for managing these tasks, which is separate from std.parallelism. A reason for this could be that you are using a third-party library like libevent. Another could be that the type of workload requires additional problem knowledge of the scheduler, so that different tasks don't tread on each other's toes (for example communicating with some servers via a pool of sockets, where you can handle several concurrent requests to different servers, but can't have two tasks read/write to the same socket at the same time, because you'd just send garbage). Really, this issue is just about extensibility and/or flexibility. The design of std.parallelism.Task assumes that all values which become available at some point in the future are the product of a process for which a TaskPool is a suitable scheduler. C++ has std::future separate from std::promise, C# has Task vs. TaskCompletionSource, etc.

I'll look into these when I have more time, but I guess what it boils down to is the need to separate the **abstraction** of something that returns a value later (I'll call that **abstraction** futures) from the **implementation** provided by std.parallelism (I'll call this **implementation** tasks), which was designed only with CPU-bound tasks and multicore in mind. On the other hand, I like std.parallelism's simplicity for handling its charter of CPU-bound problems and multicore parallelism. Perhaps the solution is to define another Phobos module that models the **abstraction** of futures and provide an adapter of some kind to make std.parallelism tasks, which are a much lower-level concept, fit this model.
I don't think the **general abstraction** of a future should be defined in std.parallelism, though. std.parallelism includes parallelism-oriented things besides tasks, e.g. parallel map, reduce, foreach. Including a more abstract model of values that become available later would make its charter too unfocused.

Maybe using the word "callback" was a bit misleading, but the callback would be invoked on the worker thread (or by whoever invokes the hypothetical Future.complete(result) method). Probably the most trivial use case would be to set a condition variable in it in order to implement a waitAny(Task[]) method, which waits until the first of a set of tasks is completed. Ever wanted to wait on multiple condition variables? Or used select() with multiple sockets? This is what I mean.

Well, implementing something like ContinueWith or Future.complete for std.parallelism tasks would be trivial, and I see how waitAny could easily be implemented in terms of this. I'm not sure I want to define an API for this in std.parallelism, though, until we have something like a std.future and the **abstraction** of a future is better-defined.

For more advanced/application-level use cases, just look at any use of ContinueWith in C#. std::future::then() is also proposed for C++, see e.g. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3327.pdf. I didn't really read the N3327 paper in detail, but from a brief look it seems to be a nice summary of what you might want to do with tasks/asynchronous results – I think you could find it an interesting read.

I don't have time to look at these right now, but I'll definitely look at them sometime soon. Thanks for the info.
Re: openMP
On Thursday, 4 October 2012 at 16:07:35 UTC, David Nadlinger wrote: For more advanced/application-level use cases, just look at any use of ContinueWith in C#. std::future::then() is also proposed for C++, see e.g. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3327.pdf. I didn't really read the the N3327 paper in detail, but from a brief look it seems to be a nice summary of what you might want to do with tasks/asynchronous results – I think you could find it an interesting read. David Thanks for posting this. It was an incredibly useful read for me! Given that the code I write is generally compute-intensive, not I/O intensive, I'd never given much thought to the value of futures in I/O intensive code before this discussion. I stand by what I said before: Someone (not me because I'm not intimately familiar with the use cases; you might be qualified) should write a std.future module for Phobos that properly models the **abstraction** of a future. It's only tangentially relevant to std.parallelism's charter, which includes both a special case of futures that's useful to SMP parallelism and other parallel computing constructs. Then, we should define an adapter that allows std.parallelism Tasks to be modeled more abstractly as futures when necessary, once we've nailed down what the future **abstraction** should look like.
Re: openMP
Unless we're using different terminology here, futures are just std.parallelism Tasks. On Wednesday, 3 October 2012 at 10:17:41 UTC, Nick Sabalausky wrote: On Wed, 03 Oct 2012 09:08:47 +0100 Russel Winder rus...@winder.org.uk wrote: Now that C++ has made the jump to using futures and asynchronous function calls as an integral part of the language, Speaking of, do we have futures in D yet? IIRC, way back last time I asked about it there was something that needed taken care of first, though I don't remember what. If we don't have them ATM, is there currently anything in the way of actually creating them?
Re: openMP
Ok, now I vaguely remember seeing stuff about futures in your Thrift code and wondering why it was there. I'm a little bit confused about what you want. If I understand correctly, std.parallelism can already do it pretty easily, but maybe the docs need to be improved a little to make it obvious how. All you have to do is something like this:

auto createFuture() {
    auto myTask = task!someFun();  // Returns a _pointer_ to a Task.
    taskPool.put(myTask);  // Or myTask.executeInNewThread();

    // A task created with task() can outlive the scope it was created in.
    // A scoped task, created with scopedTask(), cannot. This is safe,
    // since myTask is NOT scoped and is a _pointer_ to a Task.
    return myTask;
}

In this case myTask is already running using the execution resources specified in createFuture(). Does this do what you wanted? If so, I'll clarify the documentation. If not, please clarify what you needed and the relevant use cases so that I can fix std.parallelism.

On Wednesday, 3 October 2012 at 15:50:38 UTC, David Nadlinger wrote: On Wednesday, 3 October 2012 at 14:10:57 UTC, dsimcha wrote: Unless we're using different terminology here, futures are just std.parallelism Tasks. No, std.parallelism.Tasks are not really futures – they offer a constrained [1] future interface, but couple this with the notion that a Task can be executed at some point on a TaskPool chosen by the user. Because of this, I had to implement my own futures for the Thrift async stuff, where I needed a future as a promise [2] by an invoked entity that it kicked off a background activity which will eventually return a value, but which the users can't »start« or choose to »execute it now«, as they can with Tasks. If TaskPool had an execute() method which took a delegate to execute (or a »Task«, for that matter) and returned a new object which serves as a »handle« with wait()/get()/… methods, _that_ would (likely) be a future.
David [1] Constrained in the sense that it is only meant for short-/synchronous-running tasks and thus e.g. offer no callback mechanism. [2] Let's not get into splitting hairs regarding the exact meaning of »Future« vs. »Promise«, especially because C++11 introduced a new interpretation to the mix.
Re: openMP
On Wednesday, 3 October 2012 at 21:02:07 UTC, David Nadlinger wrote: On Wednesday, 3 October 2012 at 19:42:07 UTC, dsimcha wrote: If not, please clarify what you needed and the relevant use cases so that I can fix std.parallelism. In my use case, conflating the notion of a future, i.e. a value that becomes available at some point in the future, with the process which creates that future makes no sense. So the process which creates the future is a Task that executes in a different thread than the caller? And an alternative way that a value might become available in the future is e.g. if it's being retrieved from some slow I/O process like a database or network? For example, let's say you are writing a function which computes a complex database query from its parameters and then submits it to your query manager/connection pool/… for asynchronous execution. You cannot use std.parallelism.Task in this case, because there is no way of expressing the process which retrieves the result as a delegate running inside a TaskPool. Ok, I'm confused here. Why can't the process that retrieves the result be expressed as a delegate running in a TaskPool or a new thread? Or, say you want to write an aggregator, combining the results of several futures together, again offering the same future interface (maybe an array of the original result types) to consumers. Again, there is no computation-bound part to that at all, which would make sense to run on a TaskPool – you are only waiting on the other tasks to finish. Maybe I'm just being naive since I don't understand the use cases, but why couldn't you just create an array of Task objects? The second problem with std.parallelism.Task is that your only choice is polling (or blocking, for that matter). 
Yes, callbacks are a hairy thing to do if you can't be sure what thread they are executed on, but not having them severely limits the power of your abstraction, especially if you are dealing with non-CPU-bound tasks (as many of today's modern use cases are). I'm a little confused about how the callbacks would be used here. Is the idea that some callback would be called when the task is finished? Would it be called in the worker thread or the thread that submitted the task to the pool? Can you provide a use case? For example, something my mentor asked to implement for Thrift during last year's GSoC was a feature which allows to send a request out to a pool of servers concurrently, returning the first one of the results (apparently, this mechanism is used as a sharding mechanism in some situations – if a server doesn't have the data, it simply ignores the request). "First one of the results" == the result produced by the first server to return anything? How would you implement something like that as a function Task[] -> Task? For what it's worth, Task in C# (which is quite universally praised for its take on the matter) also has a »ContinueWith« method which is really just a completion callback mechanism. I'll look into ContinueWith and see if it's implementable in std.parallelism without breaking anything. std.parallelism.Task is great for expressing local resource-intensive units of work (and fast!), but I think it is too rigid and specialized for that case to be generally useful. Right. I wrote std.parallelism with resource-intensive units of work in mind because that's the use case I was familiar with. It was designed first and foremost to make using SMP parallelism _simple_. In hindsight I might have erred too much on the side of making simple things simple vs. complicated things possible, or over-specialized it and avoided solving an important, more general problem.
I'll try to understand your use cases and see if they can be addressed without making simple things more complicated. I think the best way you could help me understand what I've overlooked in std.parallelism's design is to give a quick n' dirty example of how an API that does what you want would be used. Even more generally, any _concise, concrete_ use cases, even toy use cases, would be a huge help.
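As a concrete starting point for the waitAny idea discussed in this exchange, here is a rough sketch built from completion callbacks layered on top of std.parallelism (the function and its shape are hypothetical; nothing like this exists in the module today):

```d
import core.sync.condition : Condition;
import core.sync.mutex : Mutex;
import std.parallelism : task, taskPool;

// Hypothetical sketch: wrap each unit of work in a delegate that
// signals a condition variable on completion, then block until the
// first signal. This is exactly the completion-callback layer that
// std.parallelism currently lacks.
size_t waitAny(void delegate()[] work) {
    auto m = new Mutex;
    auto c = new Condition(m);
    size_t winner = size_t.max;  // guarded by m

    foreach (i, dg; work) {
        // Bind the loop variables by value for the closure.
        void delegate() wrap(size_t idx, void delegate() d) {
            return () {
                d();
                synchronized (m) {
                    if (winner == size_t.max)
                        winner = idx;
                    c.notify();
                }
            };
        }
        taskPool.put(task(wrap(i, dg)));
    }

    synchronized (m) {
        while (winner == size_t.max)
            c.wait();
    }
    return winner;
}
```

Because D closures heap-allocate the captured frame, it is safe for the "losing" tasks to finish and signal after waitAny has already returned.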
Re: Status on Precise GC
On Sunday, 9 September 2012 at 16:51:15 UTC, Jacob Carlborg wrote: On 2012-09-08 23:35, Tyler Jameson Little wrote: Awesome, that's good news. I'd love to test it out, but I've never built the D runtime (or Phobos for that matter) from source. Are there any instructions, or do I just do something like

make
sudo make install

and it'll put itself in the right places? FWIW, I'm running Linux with the standard DMD 2.060 compiler.

Just run:

make -f posix.mak

Or, for Windows:

make -f win32.mak

You also need to build Phobos, which automatically links the druntime objects into a single library file, by going into the Phobos directory and doing the same thing. An annoying issue on Windows, though, is that DMD keeps running out of memory when all the precise GC templates are instantiated. I've been meaning to rewrite the makefile to separately compile Phobos on Windows, but I've been preoccupied with other things.
Re: Status on Precise GC
Here's the GSoC project I mentored this summer. A little integration work still needs to be done, and I've been meaning to ping the student about the status of this. If you want, I'd welcome some beta testers. https://github.com/Tuna-Fish/druntime/tree/gc_poolwise_bitmap On Saturday, 8 September 2012 at 01:55:44 UTC, Tyler Jameson Little wrote: This issue on bugzilla hasn't been updated since July 2011, but it's assigned to Sean Kelly: http://d.puremagic.com/issues/show_bug.cgi?id=3463 I've found these threads concerning a precise GC: http://www.digitalmars.com/d/archives/digitalmars/D/learn/Regarding_the_more_precise_GC_35038.html http://www.digitalmars.com/d/archives/digitalmars/D/How_can_I_properly_import_functions_from_gcx_in_object.di_171815.html Is this issue obsolete, or is it being worked on? Reason being, I'm writing a game in D and I plan to write it in nearly 100% D (with the exception being OpenGL libraries and the like), but I know I'll run into problems with the GC eventually. If this is an active project that may get finished in the relative near term (less than a year), then I'd feel comfortable knowing that eventually problems may go away. I want to eventually make this work with ARM (Raspberry PI cubieboard), and the GC is a major blocker here (well, and a cross-compiler, but I'll work that out when I get there). I'm using dmd atm if that matters. Thanks! Jameson
Re: Phobos unittest failure on single-core machines
On Friday, 24 August 2012 at 02:16:24 UTC, Ed McCardell wrote: When trying to run the Phobos unittests on my 32- and 64-bit Linux single-processor machines, I get this output:

Testing generated/linux/debug/64/unittest/std/parallelism
totalCPUs = 1
core.exception.AssertError@std.parallelism(4082): unittest failure

Has anyone else seen this, or is it possible that I have an error in my dmd setup? (I'm using dmd/druntime/phobos from git HEAD, building in what I thought was the normal manner). --Ed McCardell

This looks to be a bug in a recently-added feature. I'll look at it in detail tonight, but I think I know what the problem is and it's pretty easy to fix. Can you please file a Bugzilla report and note whether it always occurs or is non-deterministic?
Re: Antti-Ville Tuuainen Passes GSoC Final Evaluation
On Thursday, 23 August 2012 at 11:40:22 UTC, Rory McGuire wrote: On Thu, Aug 23, 2012 at 4:01 AM, Chad J chadjoan@__spam.is.bad__gmail.com wrote: Poolwise bitmap... what an interesting name. I'll look forward to learning about the concepts behind it! +1

Basically, the idea is to store information about what is and isn't a pointer at the pool level instead of at the block level. My attempt from a long time ago at precise heap scanning, and Antti-Ville's first attempt, stored meta-data at the end of every allocated block. This worked well for large arrays, but was terribly inefficient for smaller allocations and made the GC code even messier than it already is. The overhead was a fixed (void*).sizeof bits per block. Now, each pool has a bit array that contains one bit for every possible aligned pointer. The overhead is always 1 bit for every (void*).sizeof bytes, no matter how large or small the block is.
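The overhead arithmetic described above can be sketched as follows (a toy illustration, not code from the project; the function name is made up):

```d
// One bit per possible aligned pointer: a pool of N bytes has
// N / (void*).sizeof candidate pointer slots, each needing one
// mark bit, regardless of how the pool is carved into blocks.
size_t bitmapBytes(size_t poolBytes) {
    immutable slots = poolBytes / (void*).sizeof; // aligned slots
    return (slots + 7) / 8;                       // bits -> bytes, rounded up
}

// e.g. on 64-bit, a 1 MiB pool needs 131_072 bits = 16 KiB:
// a fixed ~1.6% overhead, independent of block sizes.
```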
Re: Antti-Ville Tuuainen Passes GSoC Final Evaluation
On Thursday, 23 August 2012 at 14:38:19 UTC, Alex Rønne Petersen wrote: Yes, but parallelization of the mark phase is fairly trivial, and something we should probably look into. Ironically, Antti-Ville's original proposal involved parallelization. This was scrapped because, after rtinfo was added, we agreed that precise heap scanning was more important and looked newly feasible.
Antti-Ville Tuuainen Passes GSoC Final Evaluation
Congratulations, Antti-Ville! This project creates a better implementation of precise GC heap scanning than anything that's been created so far for D. The goal is to eventually integrate it into standard D distributions. Any volunteers for beta testing? Code: https://github.com/Tuna-Fish/druntime/tree/gc_poolwise_bitmap The code for this project is a fork of druntime. The master branch was a failed (or less successful) experiment. The version we're going with for integration is the gc_poolwise_bitmap branch.
Re: Fragile ABI
On Thursday, 16 August 2012 at 14:58:23 UTC, R Grocott wrote: C++'s fragile ABI makes it very difficult to write class libraries without some sort of workaround. For example, RapidXML and AGG are distributed as source code; GDI+ is a header-only wrapper over an underlying C interface; and Qt makes heavy use of the Pimpl idiom, which makes its source code much more complex than it needs to be. This is also a major problem for any program which wants to expose a plugin API.

Since pimpl is useful but messy, given D's metaprogramming capabilities, maybe what we need is a Pimpl template in Phobos:

// The implementation struct.
struct SImpl {
    int a, b, c;
    void fun() {}
}

// Automatically generate code for the Pimpl wrapper.
alias Pimpl!SImpl S;
auto s = new S;

On the other hand, IIUC Pimpl doesn't solve the vtable part of the problem, only the data members part. (Correct me if I'm wrong here, since I admit to knowing very little about the fragile ABI problem or its workarounds.)
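To make the idea concrete, here is a rough sketch of what such a Pimpl template might look like (entirely hypothetical; a real ABI-stable version would also hide Impl's layout behind an opaque declaration and a stable allocation function, rather than instantiating against the full definition as done here):

```d
// Hypothetical sketch of the proposed Pimpl template: the wrapper's
// layout is a single opaque pointer, so it stays binary-compatible
// when Impl gains or loses data members; opDispatch forwards all
// member access to the hidden implementation.
struct Pimpl(Impl) {
    private void* impl_;

    static Pimpl create() {
        return Pimpl(cast(void*) new Impl); // GC-allocated implementation
    }

    auto ref opDispatch(string name, Args...)(auto ref Args args) {
        static if (Args.length == 0)
            return mixin("(cast(Impl*) impl_)." ~ name);
        else
            return mixin("(cast(Impl*) impl_)." ~ name ~ "(args)");
    }
}

// Usage, mirroring the example above:
//   alias S = Pimpl!SImpl;
//   auto s = S.create();
//   s.fun();   // forwarded to the hidden SImpl
```

As noted in the post, this only addresses the data-member half of the fragility; virtual dispatch through the wrapper would need a separate mechanism.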
Re: Which D features to emphasize for academic review article
On Sunday, 12 August 2012 at 03:30:24 UTC, bearophile wrote: Andrei Alexandrescu: - The language's superior modeling power and level of control comes at an increase in complexity compared to languages such as e.g. Python. So the statistician would need a larger upfront investment in order to reap the associated benefits. Statisticians often use the R language (http://en.wikipedia.org/wiki/R_language ). Python contains much more computer science and CS complexity compared to R. Not just advanced stuff like coroutines, metaclasses, decorators, Abstract Base Classes, operator overloading, and so on, but even simpler things, like generators, standard library collections like heaps and deques, and so on. For some statisticians I've seen, even several parts of Python are much too hard to use or understand. I have rewritten several of their Python scripts. Bye, bearophile

For people with more advanced CS/programming knowledge, though, this is an advantage of D. I find Matlab and R incredibly frustrating to use for anything but very standard matrix/statistics computations on data that's already structured the way I like it. This is mostly because the standard CS concepts you mention are at best awkward and at worst impossible to express and, being aware of them, I naturally want to take advantage of them. Using Matlab or R feels like being forced to program with half the tools in my toolbox either missing or awkwardly misshapen, so I avoid it whenever practical. (Actually, languages like C and Java that don't have much modeling power feel the same way to me now that I've primarily used D and to a lesser extent Python for the past few years. Ironically, these are the languages that are easy to integrate with R and Matlab respectively. Do most serious programmers who work in problem domains relevant to Matlab and R feel this way, or is it just me?) This was my motivation for writing Dstats and mentoring Cristi's fork of SciD.
D's modeling power is so outstanding that I was able to replace R and Matlab for a lot of use cases with plain old libraries written in D.
Re: Which D features to emphasize for academic review article
On Monday, 13 August 2012 at 01:52:28 UTC, Joseph Rushton Wakeling wrote: The main use-case and advantage of both R and MATLAB/Octave seems to me to be the plotting functionality -- I've seen some exceptionally beautiful stuff done with R in particular, although I've not personally explored its capabilities too far. The annoyance of R in particular is the impenetrable thicket of dependencies that can arise among contributed packages; it feels very much like some are thrown over the wall and then built on without much concern for organization. :-( I've addressed that, too :). https://github.com/dsimcha/Plot2kill Obviously this is a one-man project without nearly the same number of features that R and Matlab have, but like Dstats and SciD, it has probably the 20% of functionality that handles 80% of use cases. I've used it for the figures in scientific articles that I've submitted for publication and in my Ph.D. proposal and dissertation. Unlike SciD and Dstats, Plot2kill doesn't highlight D's modeling capabilities that much, but it does get the job done for simple 2D plots.
Re: MPI Concurrency library update?
All I have is a very ad-hoc wrapper that does just what I needed for my purposes. It basically has function prototypes for the parts of the API I actually care about and a few high-level wrappers for passing primitives and arrays to other nodes of the same architecture. On Saturday, 11 August 2012 at 01:12:29 UTC, Andrew wrote: On Saturday, 11 August 2012 at 00:24:40 UTC, dsimcha wrote: On Friday, 10 August 2012 at 23:40:43 UTC, Andrew Spott wrote: A while ago, (August of last year I think), there was talk about a MPI wrapper for D. Has there been any update on that? I was considering writing one, but I wanted it to be high-level and easy-to-use. I ended up not doing it, initially because I was waiting for serialization to be added to Phobos (which I thought was imminent) and then because I got busy with unrelated things. I think that a nice high-level MPI wrapper for D should be tightly integrated into a serialization library to encapsulate the low-level details of passing non-trivial data structures across nodes. I doubt I'll get around to implementing it when serialization is added, though, because I'm probably past the MPI-using stage of my life (my Ph.D. research is basically finished, I'm just revising my dissertation and preparing to defend) so I wouldn't get to eat my own dogfood. Well, my PhD research is just beginning... :) Any chance you could pass on what you have? It might help me out a bit, and reduce my workload toward creating a usable MPI library. Thanks. -Andrew
Re: MPI Concurrency library update?
On Friday, 10 August 2012 at 23:40:43 UTC, Andrew Spott wrote: A while ago, (August of last year I think), there was talk about a MPI wrapper for D. Has there been any update on that? I was considering writing one, but I wanted it to be high-level and easy-to-use. I ended up not doing it, initially because I was waiting for serialization to be added to Phobos (which I thought was imminent) and then because I got busy with unrelated things. I think that a nice high-level MPI wrapper for D should be tightly integrated into a serialization library to encapsulate the low-level details of passing non-trivial data structures across nodes. I doubt I'll get around to implementing it when serialization is added, though, because I'm probably past the MPI-using stage of my life (my Ph.D. research is basically finished, I'm just revising my dissertation and preparing to defend) so I wouldn't get to eat my own dogfood.
Re: Which D features to emphasize for academic review article
Ok, so IIUC the audience is academic BUT is people interested in using D as a means to an end, not computer scientists? I use D for bioinformatics, which IIUC has similar requirements to econometrics. From my point of view, I'd emphasize the following:

Native efficiency. (Important for large datasets and Monte Carlo simulations.)

Garbage collection. (Important because it makes it much easier to write non-trivial data structures that don't leak memory, and statistical analyses are a lot easier if the data is structured well.)

Ranges/std.range/builtin arrays and associative arrays. (Again, these make data handling a pleasure.)

Templates. (Makes it easier to write algorithms that aren't overly specialized to the data structure they operate on. This can also be done with OO containers but requires more boilerplate and compromises on efficiency.)

Disclaimer: These last two are things I'm the primary designer and implementer of. I intentionally put them last so it doesn't look like a shameless plug.

std.parallelism. (Important because you can easily parallelize your simulation, etc.)

dstats (https://github.com/dsimcha/dstats). Important because a lot of statistical analysis code is already implemented for you. It's admittedly very basic compared to e.g. R or Matlab, but it's also in many cases better integrated and more efficient. I'd say that it has the 15% of the functionality that covers ~70% of use cases. I welcome contributors to add more stuff to it. I imagine economists would be interested in time series, which is currently a big area of missing functionality.
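To make the parallelism point concrete, here is a minimal sketch (the workload is invented purely for illustration) of how std.parallelism turns an ordinary foreach over independent iterations into a multithreaded one:

```d
import std.math : sqrt;
import std.parallelism : parallel;

void main() {
    auto results = new double[1_000_000];

    // Each iteration is independent, so parallel() can split the range
    // across the worker threads in the default task pool. The loop body
    // is written exactly as a serial foreach would be.
    foreach (i, ref elem; parallel(results)) {
        elem = sqrt(cast(double) i);
    }

    assert(results[9] == 3.0);
}
```

The same data-parallel pattern covers most Monte Carlo simulations: each trial fills its own slot in the result array, and no locking is needed.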
Review Queue: Should We Start Reviews Again?
Apparently nothing's been getting reviewed for inclusion in Phobos lately, and the review queue has once again become fairly long according to the wiki (http://prowiki.org/wiki4d/wiki.cgi?ReviewQueue). I also noticed that a new XML library is already in the review queue. Is there some reason why none of this stuff is being reviewed? If std.xml2 is really ready for review, we should review it ASAP, since XML processing is fundamental functionality for a modern standard library in a high-level language. IIRC the current std.xml's inadequacy was a major complaint that Jacob Carlborg, the author of std.serialize, had. Perhaps when/if std.xml2 is accepted, he should modify std.serialize to use it, and std.serialize should be next in the queue.
Hiatus, Improving Participation in D Development
I've been on somewhat of a hiatus these past few months and have only worked on D-related development sporadically. There are several reasons for my absence, some of which will hopefully change soon, and I hope to make a comeback. Below are the reasons why my contributions have declined and some suggestions for improvements to the D community where the issues aren't specific to me:

0. There's a lot less stuff that's broken or missing now than a few years ago when I started contributing. This has led to a mild complacency, as D is already awesome. For example, it's been a long time since I hit a compiler bug that caused me significant hassle.

1. I'm writing my Ph.D. thesis and looking for jobs. I still have some time to contribute, but D development isn't the top idea in my mind due to these distractions. This is in some ways the root of the problem, as I have less time and mental energy to keep up with D through informal channels. I think my job search is over, though, so that's one less distraction.

2. Because I'm writing my thesis, I don't program much for my research anymore. I therefore don't notice little things that are broken or get cool ideas from other languages as often. To make it easier for someone to find bugs and enhancement requests in areas he/she is already familiar with, I'd like to see an ability to search by module (for Phobos/druntime) or feature (for DMD) in Bugzilla. For example, in Phobos I'm most familiar with std.range, std.algorithm, std.parallelism and std.functional. I'd like to be able to easily query a list of bugs specific to those modules.

3. As the community has grown larger, more people besides me have stepped up. Of course this is a good thing. The only downside is that I've lost track of who's working on what, what its status is, what still needs to be done, and what the holdups are. Perhaps we need some central place other than this newsgroup where this information is posted for key D projects.
For example, we'd have a page that says Person X is working on a new std.xml. Here's the Github repository. It's stalled because no one can agree on a design for Y. We should also maintain a slightly more formal wishlist of stuff that no one's working on that's waiting to be done. BTW, if no one is working on a new std.xml anymore, I might want to start. I interviewed for a job where they wanted me to do a small prototype as part of the hiring process that involved parsing XML. I was allowed to use any language I wanted. I think my D projects played a large role in me getting an offer, but I couldn't use D for the prototype because std.xml is so bad. I ended up using Python instead.

4. The release cycle has slowed greatly. What happened here? The 1-2 month release cycles were a good motivator because they created mild deadline pressure to get features and fixes checked in before the next release.

5. The amount of stuff on this forum and the mailing lists has become overwhelming. I've recently remedied this to a small degree by unsubscribing from dmd-internals. I've never been a contributor to the compiler itself and had only subscribed to this list to track bug fixes and 64-bit support implementation. Now, the signal-to-noise ratio of my inbox is good enough that I actually read the Phobos and druntime stuff again instead of just glossing over all my D-related email.

As far as this forum, I suggest a split something like the following, so that it has a better signal-to-noise ratio from the perspective of people with specific interests:

D.language-design: Long, complicated threads about language features and the theory behind them belong here.

D.phobos-design: Since the Phobos mailing list is intended mostly for regular contributors and is focused on individual pull requests and commits, this is where high-level design stuff would get discussed w.r.t. Phobos.

D.ecosystem: Stuff about third-party libraries, Deimos, toolchains, etc. goes here.
D.adoption: Discussions about why D was or wasn't adopted for a given project and how to increase its adoption go here.

D.learn: Questions about using D. We already have this, but we need to encourage people to use it more instead of posting to the main group.
Re: Hiatus, Improving Participation in D Development
On 7/15/2012 4:54 AM, Jonathan M Davis wrote: On Sunday, July 15, 2012 00:02:09 dsimcha wrote: BTW, if noone is working on a new std.xml anymore, I might want to start. I interviewed for a job where they wanted me to do a small prototype as part of the hiring process that involved parsing XML. I was allowed to use any language I wanted. I think my D projects played a large role in me getting an offer, but I couldn't use it for the prototype because std.xml is so bad. I ended up using Python instead. Someone was working on it (Tomaz?) and was supposedly making good progress, but last time I checked, they hadn't posted anything since some time in 2010. So, as far as I can tell, that project is effectively dead. I have no idea what state it was in before it stalled or whether the code is available anywhere online. I expect that anyone who wants to work on it will either have to start from scratch or grabbing one of the existing xml parsers floating around and adjust it (though I suspect that if it's going to be range-based like it's supposed to be that any existing parsers floating around probably would need quite a bit of work to get the right API, but I don't know). It's the sort of thing that I'd love to work on given the time, but I have so much else going on that it would be ridiculous for me to even consider it. If you want to take up that baton, then I think that's great. Even if you end up taking a while to do it, that's better than getting nothing and seeing no progress as we have been for quite some time now. - Jonathan M Davis Ok, well I'll at least try to get a better handle on what's involved and how much time I'm going to have over the next few months. I'm not saying I definitely want to take it on yet.
Re: Hiatus, Improving Participation in D Development
On Sunday, 15 July 2012 at 15:46:38 UTC, David Nadlinger wrote: On Sunday, 15 July 2012 at 04:02:48 UTC, dsimcha wrote: 5. The amount of stuff on this forum and the mailing lists has become overwhelming. I've recently remedied this to a small degree by unsubscribing from dmd-internals. I've never been a contributor to the compiler itself and had only subscribed to this list to track bug fixes and 64-bit support implementation. Now, the signal-to-noise ratio of my inbox is good enough that I actually read the Phobos and druntime stuff again instead of just glossing over all my D-related email. I take it you are referring to the GitHub commit messages which are relayed to dmd-internals? Because except for those (which I just made a filter rule for), the list is really quite low-volume. Maybe we should create a dedicated d-commits list to which all the GitHub notifications get sent, similar to what other projects have? The occasional post-commit discussion could then be continued on one of the repository-specific lists, just like they are now. David Yeah. The problem is that for a while, D mailing lists became so overwhelming that I got into the habit of reflexively ignoring them completely due to poor signal-to-noise ratio w.r.t stuff I actually work on and being preoccupied with other things. Your idea may be a good one, since only the core DMD devs care about every commit but others might want to participate in higher level discussions.
Antti-Ville Tuuainen passes his midterm evaluations for GSoC 2012
Congratulations to Antti-Ville Tuuainen for passing the GSoC 2012 midterm evaluation! Despite going through a steep learning curve to learn D's template metaprogramming system, Antti-Ville has precise heap scanning for the garbage collector close to working using the new rtinfo template that has been added to object.d. His Github repository is at: https://github.com/Tuna-Fish/druntime The plans for the second half include creating an alternative implementation of precise scanning that may be more efficient and removing the global lock from malloc() if time permits.
Re: Rational numbers in D
A long time ago, this was discussed on this forum. I wrote the current candidate for std.rational, and there was talk of Don Clugston integrating the GCD function into std.bigint to take advantage of knowing BigInt's internals. According to Don, using a general algorithm here results in terrible performance. As of now, that hasn't happened, though. On 6/7/2012 1:49 PM, Joseph Rushton Wakeling wrote: Sorry for the double-post -- I already asked this in d-learn, but this may be a better place to ask. What's the current state of affairs and roadmap for inclusion of rational number support in D? I've come across David Simcha's work: http://cis.jhu.edu/~dsimcha/d/phobos/std_rational.html and a feature request on the bugzilla: http://d.puremagic.com/issues/show_bug.cgi?id=7885 but this isn't mentioned at all in the review queue: http://prowiki.org/wiki4d/wiki.cgi?ReviewQueue What's the status of work/planning for this feature and is there any kind of ETA for when it might land in Phobos? Thanks and best wishes, -- Joe
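For anyone wondering where the performance issue comes from: a rational has to reduce itself to lowest terms after every operation, and a generic Euclidean GCD over BigInt allocates a fresh BigInt on every modulus. A sketch of the generic algorithm (illustrative only, not the std.rational candidate's actual code):

```d
import std.bigint : BigInt;

// Generic Euclidean GCD. Each a % b constructs a new BigInt, so reducing
// a rational this way churns the allocator on every arithmetic operation.
// A GCD with access to BigInt's internal digit array (e.g. a binary GCD
// working in place) is what was proposed for std.bigint.
BigInt gcd(BigInt a, BigInt b) {
    while (b != 0) {
        auto r = a % b;
        a = b;
        b = r;
    }
    return a;
}

void main() {
    // 48/36 reduces by gcd = 12 to 4/3.
    assert(gcd(BigInt(48), BigInt(36)) == 12);
}
```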
Re: run-time stack-based allocation
On 5/7/2012 12:08 PM, Gor Gyolchanyan wrote: Wasn't there an allocator mechanism under development for phobos? I remember there was a StackAllocator, that can span for arbitrary scopes. What's up with that? I wrote one. It's at https://github.com/dsimcha/TempAlloc . It hasn't been accepted to Phobos, though, because of issues w.r.t. figuring out what a more general allocator interface should look like.
Re: GC API: What can change for precise scanning?
On 4/18/2012 6:46 PM, Sean Kelly wrote: Leandro's GC (CDGC) is already set up to support precise scanning. It's in the Druntime git repository, but lacks the features added to the Druntime GC compared to the Tango GC on which CDGC is based. Still, it may be easier to update CDGC based on a diff between the Druntime and Tango GC than it would to add precise scanning to the GC Druntime currently uses. Worth a look if anyone is interested anyway. Or, failing that, I can look at it to get ideas about how to handle various annoying plumbing issues. The plumbing issues (i.e. getting the GCInfo pointers from the allocation routines into the guts of the GC) are actually the hard part of this project. Once the GC has the GCInfo pointer, making it use that for precise scanning is trivial in that I've done it before and remember roughly how I did it.
GC API: What can change for precise scanning?
Now that the compiler infrastructure has been implemented, I've gotten busy figuring out how to make D's default GC precise. As a first attempt, I think I'm going to adapt my original solution from http://d.puremagic.com/issues/show_bug.cgi?id=3463 since it's simple and it works, except that there previously was no clean way to get the offset info into the GC. As Walter pointed out in another thread, the GCInfo template is allowed to instantiate to data instead of a function. IMHO unless/until major architectural changes to the GC are made that require a function pointer, there's no point in adding this indirection. I started working on this and I ran into a roadblock. I need to know what parts of the GC API are allowed to change, and discuss how to abstract away the implementation of it from the GC API. I assume the stuff in core.memory needs to stay mostly the same, though I guess we would need to add a setType() function that takes a pointer into a block of memory and a TypeInfo object and changes how the GC interprets the bits in the block. In gc.d, we define a bunch of extern(C) functions and the proxy thing. Since we've given up on the idea of swapping precise GCs at link time, can I just rip out all this unnecessary indirection? If not, is it ok to change some of these signatures? I definitely want to avoid allocating (requiring the GC lock) and then calling a function to set the type (requiring another lock acquisition), so the signature of malloc(), etc. needs to change somewhere. More generally, what is the intended way to get GCInfo pointers from TypeInfo into the guts of the GC where they can be acted on?
Re: compiler support added for precise GC
On 4/15/2012 10:24 PM, Walter Bright wrote: Just checked it in. Of course, it doesn't actually do precise GC, it is just thrown over the wall for the library devs who are itching to get started on it. Excellent!! Maybe I'll get started on this soon.
Re: Mono-D GSoC proposal, hopefully the last thread about it
Yeah, as a mentor, I will reassure both of you that no news definitely isn't bad news and may even be good news. If you don't have any feedback, it's because we either haven't gotten around to reading your proposal yet or it had all the information we wanted and don't have any requests for clarification, etc. Don't read too much into the lack of feedback or get discouraged.
Re: D for a Qt developer
On 3/31/2012 4:23 PM, Davita wrote: One general comment: Lots of people ask for the stuff you're asking for. Progress is being made on all the relevant fronts, slowly but surely. 1) Database libs/ORMs. I think Steve Teale is working on something for this, but I don't know the details or how much progress is being made. 2) mature UI library (vector based, declarative or at least to support styling like Qt stylesheet). I think QtD is now usable since the relevant compiler bugs were ironed out. 3) Crypto libs for hashing and with asymmetric algorithm implementations. You would probably be best off linking to a C library for this. The headers are in Deimos. https://github.com/D-Programming-Deimos/openssl 4) XML libraries for generating and parsing xml docs. Although XSD validation support and XSL transforms. Phobos has a pretty rudimentary XML lib. Tango's been ported to D2, though. You could try it. https://github.com/SiegeLord/Tango-D2 5) networking libs with several main protocol implementations such as Http, FTP and SMTP. std.net.curl was just added to the latest Phobos release. 6) and of course, RAD styled IDE. Visual D might do what you want. Those are the minimum of my requirements in order to start development for a platform. So guys, what do you think, will D be useful for me? :-) P.S. what happened to Qt bindings? I saw that it was abandoned. Maybe working with trolltech/Nokia team to integrate D in QtCreator and creating and maintaining Qt's D bindings would be the most awesome decision, but how achievable is it? :) I personally don't use QtD, so I don't know where it's hosted, but a lot of stuff that was on dsource has moved to Github. If it looks abandoned on dsource, it may have been migrated.
Re: GSoC: Linear Algebra and the SciD library
Cullen, I think the ideas page sums it up pretty well. Matrix factorizations, sparse matrices and general polish and bug fixing are the main goals I had in mind, though we're definitely open to any other ideas you may have. As someone with a strong math background, you could add a lot of value by helping us figure out what features are worth adding in addition to just implementing the features that have been previously suggested. Unfortunately, though, SciD uses template metaprogramming very heavily. If you're not comfortable with template metaprogramming in either C++ or D (you imply that you have no experience with either language) then you'd need to get up to speed very quickly. The project will have almost zero chance of success if you don't master templates. If this sounds too difficult, we still encourage you to submit a proposal for another project that doesn't use templates or other advanced, D-specific features so heavily. --David Simcha
Re: Three Unlikely Successful Features of D
1. Scope guards. The need to execute some code at the end of a scope is pervasive and has led to about a zillion workarounds from gotos in C to RAII in C++ to try/finally in Java and C# to with statements in Python. D is the only language I know of that lets the programmer specify such a simple and common intention directly. 2. CTFE. There's a certain elegance to having most of the language available at compile time and it seems like there's no shortage of creative uses for this. 3. Static if. This is the most important feature for converting template metaprogramming from an esoteric form of sorcery to a practical, readable tool for the masses. Most of the compile time introspection D provides would be almost unusable without it.
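All three features can be shown in a few lines of D (an illustrative sketch written for this summary, not code from any of the projects discussed here):

```d
// 2. CTFE: an ordinary function, evaluated by the compiler whenever its
// result is needed in a static context.
uint factorial(uint n) { return n <= 1 ? 1 : n * factorial(n - 1); }
static assert(factorial(5) == 120); // checked at compile time

// 3. static if: one readable template instead of a pile of overloads.
auto twice(T)(T x) {
    static if (is(T : long))
        return x * 2;   // numeric path
    else
        return x ~ x;   // array/string path: concatenation
}

int cleanups;

// 1. Scope guards: the cleanup runs on normal return AND on throw,
// with no try/finally nesting.
void work(bool fail) {
    scope(exit) ++cleanups;
    if (fail) throw new Exception("boom");
}

void main() {
    assert(twice(21) == 42);
    assert(twice("ab") == "abab");
    work(false);
    try { work(true); } catch (Exception) {}
    assert(cleanups == 2); // guard fired both times
}
```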
Re: We have a GSoC mentor already: David Simcha
On 3/2/2012 9:38 PM, Trass3r wrote: Am 03.03.2012, 00:43 Uhr, schrieb Andrei Alexandrescu seewebsiteforem...@erdani.org: David Simcha applied for a second gig as a GSoC mentor. Needless to say, his application was approved :o). Please join me in welcoming him! Yay! Time to ask about the status of the last GSoC project, i.e. the LinAlg one. If it still needs lots of work, maybe there could be another round on that. The status is that the debugging and polishing is slowly happening. I've used the library for real work and while rough around the edges, it's quite good. I've also worked on adding some Lapack wrapper stuff that Cristi (the student I mentored) didn't get to and improving the test suite. The key todos are:

1. Fix a few nasty bugs that are more design flaws than run-of-the-mill bugs. This is hard for me to do without Cristi's input because I am unclear on a few design decisions.

2. Documentation.

3. More LAPACK wrappers.

4. More real-world testing. I'm not comfortable submitting something this large and complicated for Phobos inclusion until a few people have used it extensively for real work and found all the bugs and design flaws.

5. Some serious profiling/performance optimization.

6. Get allocators into Phobos, since Cristi's SciD fork depends on them.
Re: We have a GSoC mentor already: David Simcha
On 3/3/2012 2:04 AM, Andrei Alexandrescu wrote: Mentors are chosen before students and projects. As we all know, David has a variety of interests, with scientific programming at the top. Andrei I'm open to a variety of projects, but scientific computing and garbage collection are at the top of my list.
Re: Inheritance of purity
On Friday, 17 February 2012 at 03:24:50 UTC, Jonathan M Davis wrote: No. Absolutely not. I hate the fact that C++ does this with virtual. It makes it so that you have to constantly look at the base classes to figure out what's virtual and what isn't. It harms maintenance and code understandability. And now you want to do that with @safe, pure, nothrow, and const? Yuck. I can understand wanting to save some typing, but I really think that this harms code maintainability. It's the sort of thing that an IDE is good for. It does stuff like generate the function signatures for you or fill in the attributes that are required but are missing. Besides the fact that not everyone uses an IDE, my other counter-argument to these "the IDE generates your boilerplate" arguments is that code is read and modified more often than it is written. I don't like reading or modifying boilerplate code any more than I like writing it. Besides, if you're using a fancy IDE, can't it show you the attributes inherited from the base class?
Re: GSoC will open soon
On 2/6/2012 12:34 AM, Andrei Alexandrescu wrote: Yah, a wiki page sounds great. Andrei Wiki is up, ice is broken. Let's start adding some ideas! I also think this year we should have a possible mentors line next to each project to keep track of who's interested in mentoring what. For example, I added garbage collection to the page. If you're also interested in mentoring a GC project, just append yourself to the list of possible mentors. http://prowiki.org/wiki4d/wiki.cgi?GSOC_2012_Ideas
Re: Message passing between threads: Java 4 times faster than D
I wonder how much it helps to just optimize the GC a little. How much does the performance gap close when you use DMD 2.058 beta instead of 2.057? This upcoming release has several new garbage collector optimizations. If the GC is the bottleneck, then it's not surprising that anything that relies heavily on it is slow, because D's GC is still fairly naive. On Thursday, 9 February 2012 at 15:44:59 UTC, Sean Kelly wrote: So a queue per message type? How would ordering be preserved? Also, how would this work for interprocess messaging? An array-based queue is an option however (though it would mean memmoves on receive), as are free-lists for nodes, etc. I guess the easiest thing there would be a lock-free shared slist for the node free-list, though I couldn't weigh the chance of cache misses from using old memory blocks vs. just expecting the allocator to be fast. On Feb 9, 2012, at 6:10 AM, Gor Gyolchanyan gor.f.gyolchan...@gmail.com wrote: Generally, D's message passing is implemented in quite an easy-to-use way, but it is far from being fast. I dislike the Variant structure, because it adds a huge overhead. I'd rather have a templated message passing system with a type-safe message queue, so no Variant is necessary. In specific cases messages can be polymorphic objects. This will be way faster than Variant. On Thu, Feb 9, 2012 at 3:12 PM, Alex Dovhal alex dov...@yahoo.com wrote: Sorry, my mistake. It's strange to have different 'n', but you measure speed as 1000*n/time, so it doesn't matter if n is 10 times bigger. -- Bye, Gor Gyolchanyan.
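For the curious, Gor's suggestion can be sketched roughly as follows. All names here are invented for illustration; this is not std.concurrency's API. Typing the queue on the message removes the Variant boxing, at the cost Sean points out: ordering is only preserved within each message type, not across types:

```d
import core.sync.condition : Condition;
import core.sync.mutex : Mutex;

// A queue statically typed on its message. put/take copy T directly,
// with no Variant wrapping or TypeInfo dispatch on the receive side.
class TypedQueue(T) {
    private T[] buf;
    private Mutex m;
    private Condition ready;

    this() {
        m = new Mutex;
        ready = new Condition(m);
    }

    void put(T msg) {
        synchronized (m) {  // synchronized on a Mutex locks that Mutex
            buf ~= msg;
            ready.notify();
        }
    }

    T take() {
        synchronized (m) {
            while (buf.length == 0)
                ready.wait();   // releases m while blocked
            auto msg = buf[0];
            buf = buf[1 .. $];  // naive dequeue; a ring buffer would avoid this
            return msg;
        }
    }
}

void main() {
    auto q = new TypedQueue!int;
    q.put(1);
    q.put(2);
    assert(q.take() == 1 && q.take() == 2); // FIFO within the type
}
```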
Re: [xmlp] the recent garbage collector performance improvements
On Thursday, 2 February 2012 at 04:38:49 UTC, Robert Jacques wrote: An XML parser would probably want some kind of stack segment growth schedule, which, IIRC isn't supported by RegionAllocator. I had considered putting that in RegionAllocator but I was skeptical of the benefit, at least assuming we're targeting PCs and not embedded devices. The default segment size is 4MB. Trying to make the initial size any smaller won't save much memory. Four megabytes is also big enough that new segments would be allocated so infrequently that the cost would be negligible. I concluded that the added complexity wasn't justified.
Re: [xmlp] the recent garbage collector performance improvements
On Thursday, 2 February 2012 at 18:06:24 UTC, Manu wrote: On 2 February 2012 17:40, dsimcha dsim...@yahoo.com wrote: On Thursday, 2 February 2012 at 04:38:49 UTC, Robert Jacques wrote: An XML parser would probably want some kind of stack segment growth schedule, which, IIRC isn't supported by RegionAllocator. at least assuming we're targeting PCs and not embedded devices. I don't know about the implications of your decision, but comment makes me feel uneasy. I don't know how you can possibly make that assumption? Have you looked around at the devices people actually use these days? PC's are an endangered and dying species... I couldn't imagine a worse assumption if it influences the application of D on different systems. I'm not saying that embedded isn't important. It's just that for low level stuff like memory management it requires a completely different mindset. RegionAllocator is meant to be fast and simple at the expense of space efficiency. In embedded you'd probably want completely different tradeoffs. Depending on how deeply embedded, space efficiency might be the most important thing. I don't know exactly what tradeoffs you'd want, though, since I don't do embedded development. My guess is that you'd want something completely different, not RegionAllocator plus a few tweaks that would complicate it for PC use. Therefore, I designed RegionAllocator for PCs with no consideration for embedded environments.
Re: [xmlp] the recent garbage collector performance improvements
On Thursday, 2 February 2012 at 18:55:02 UTC, Andrej Mitrovic wrote: On 2/2/12, Manu turkey...@gmail.com wrote: PC's are an endangered and dying species... Kind of like when we got rid of cars and trains and ships once we started making jumbo jets. Oh wait, that didn't happen. Agreed. I just recently got my first smartphone and I love it. I see it as a complement to a PC, though, not as a substitute. It's great for when I'm on the go, but when I'm at home or at work I like a bigger screen, a full keyboard, a faster processor, more memory, etc. Of course smartphones will get more powerful but I doubt any will ever have dual 22 inch monitors.
Re: [xmlp] the recent garbage collector performance improvements
Interesting. I'm glad my improvements seem to matter in the real world, though I'm thoroughly impressed with the amount of speedup. Even the small allocation benchmark that I was optimizing only sped up by ~50% from 2.057 to 2.058 overall and ~2x in collection time. I'd be very interested if you could make a small, self-contained test program to use as a benchmark. GC performance is one of D's biggest weak spots, so it would probably be a good bit of marketing to show that the performance is substantially better than it used to be even if it's not great yet. Over the past year I've been working on and off at speeding it up. It's now at least ~2x faster than it was last year at this time on every benchmark I've tried and up to several hundred times faster in the extreme case of huge allocations. On Wednesday, 1 February 2012 at 18:33:58 UTC, Richard Webb wrote: Last night I tried loading a ~20 megabyte xml file using xmlp (via the DocumentBuilder.LoadFile function) and a recent dmd build, and found that it took ~48 seconds to complete, which is rather poor. I tried running it through a profiler, and that said that almost all the runtime was spent inside the garbage collector. I then tried the same test using the latest Git versions of dmd/druntime (with pull request 108 merged in), and that took less than 10 seconds. This is a rather nice improvement, though still somewhat on the slow side. Some profiler numbers, if anyone is interested: Old version: Gcxfullcollect: 31.14 seconds, 69.26% runtime. Gcxmark: 4.84 seconds, 10.77% runtime. Gcxfindpool: 2.10 seconds, 4.67% runtime. New version: Gcxmark: 11.67 seconds, 50.77% runtime. Gcxfindpool: 3.58 seconds, 15.55% runtime. Gcxfullcollect: 1.69 seconds, 7.37% runtime. (Assuming that Sleepy is giving me accurate numbers. The new version is definately faster though).
Re: [xmlp] the recent garbage collector performance improvements
On Wednesday, 1 February 2012 at 22:53:11 UTC, Richard Webb wrote: For reference, the file i was testing with has ~5 root nodes, each of which has several children. The number of nodes seems to have a much larger effect on the speed than the amount of data. Sounds about right. For very small allocations sweeping time dominates the total GC time. You can see the breakdown at https://github.com/dsimcha/druntime/wiki/GC-Optimizations-Round-2 . The Tree1 benchmark is the very small allocation benchmark. Sweeping takes time linear in the number of memory blocks allocated and, for blocks of one page or less, constant time in the size of the blocks.
Re: [xmlp] the recent garbage collector performance improvements
On Wednesday, 1 February 2012 at 23:43:24 UTC, H. S. Teoh wrote: Out of curiosity, is there a way to optimize for the many small allocations case? E.g., if a function allocates, as temporary storage, a tree with a large number of nodes, which becomes garbage when it returns. Perhaps a way to sweep the entire space used by the tree in one go? Not sure if such a thing is possible. T My RegionAllocator is probably the best thing for this if the lifetime is deterministic as you describe. I rewrote the Tree1 benchmark using RegionAllocator a while back just for comparison. D Tree1 + RegionAllocator had comparable speed to a Java version of Tree1 run under HotSpot. (About 6 seconds on my box vs. in the low 30s for Tree1 with the 2.058 GC.) If all the objects are going to die at the same time but not at a deterministic time, you could just allocate a big block from the GC and place class instances in it using emplace().
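For illustration, the "allocate a big block from the GC and emplace class instances in it" suggestion at the end of the post could be sketched as follows. This is a hedged sketch, not code from xmlp or std.parallelism; Node and makeNodes are made-up names:

```d
import core.memory : GC;
import std.conv : emplace;

class Node { int value; this(int v) { value = v; } }

// Hypothetical helper: one GC.malloc call backs n Node instances, so the
// GC sweeps a single block instead of n separate ones when they all die.
Node[] makeNodes(size_t n)
{
    immutable size = __traits(classInstanceSize, Node);
    void[] buf = GC.malloc(size * n)[0 .. size * n];
    auto nodes = new Node[n];
    foreach (i; 0 .. n)
        nodes[i] = emplace!Node(buf[i * size .. (i + 1) * size], cast(int) i);
    return nodes;
}
```

The trade-off is that none of the instances can be freed individually; the whole block lives until the last reference into it dies, which matches the "all objects die at the same time" assumption in the post.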
Re: [xmlp] the recent garbage collector performance improvements
On Thursday, 2 February 2012 at 01:27:44 UTC, bearophile wrote: Richard Webb: Parsing the file with DMD 2.057 takes ~25 seconds Parsing the file with DMD 2.058(Git) takes ~6.1 seconds Parsing the file with DMD 2.058, with the GC disabled during the LoadFile call, takes ~2.2 seconds. For comparison, MSXML6 takes 1.6 seconds to load the same file. Not too much time ago Python devs have added a heuristic to the Python GC (that is a reference counter + cycle breaker), it switches off if it detects the program is allocating many items in a short time. Is it possible to add something similar to the D GC? Bye, bearophile I actually tried to add something like this a while back but I couldn't find a heuristic that worked reasonably well. The idea was just to create a timeout where the GC can't run for x milliseconds after it just ran.
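For illustration, the timeout heuristic described in the last sentence could be sketched like this. All names are hypothetical and the real druntime GC is structured differently; this only shows the shape of the idea:

```d
import core.time : MonoTime, msecs;

MonoTime lastCollection;          // thread-local in D; a sketch only
enum collectTimeout = 100.msecs;  // the "x milliseconds" from the post

// Hypothetical gate the allocator would consult before collecting.
bool shouldCollect()
{
    auto now = MonoTime.currTime;
    if (now - lastCollection < collectTimeout)
        return false; // a collection just ran; grow the heap instead
    lastCollection = now;
    return true;
}
```

The hard part, as the post says, is picking the timeout: too short and allocation bursts still trigger repeated collections, too long and the heap balloons.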
Re: [xmlp] the recent garbage collector performance improvements
Wait a minute, since when do we even have a std.xml2? I've never heard of it and it's not in the Phobos source tree (I just checked). On Thursday, 2 February 2012 at 00:41:31 UTC, Richard Webb wrote: On 01/02/2012 19:35, dsimcha wrote: I'd be very interested if you could make a small, self-contained test program to use as a benchmark. The 'test' is just / import std.xml2; void main() { string xmlPath = r"test.xml"; auto document = DocumentBuilder.LoadFile(xmlPath, false, false); } / It's xmlp that does all the work (and takes all the time). I'll see about generating a simple test file, but basically: 5 top level nodes each one has 6 child nodes each node has a single attribute, and the child nodes each have a short text value. Parsing the file with DMD 2.057 takes ~25 seconds Parsing the file with DMD 2.058(Git) takes ~6.1 seconds Parsing the file with DMD 2.058, with the GC disabled during the LoadFile call, takes ~2.2 seconds. For comparison, MSXML6 takes 1.6 seconds to load the same file.
Re: Call site 'ref'
On 1/15/2012 8:36 AM, Alex Rønne Petersen wrote: Hi, I don't know how many times I've made the mistake of passing a local variable to a function which takes a 'ref' parameter. Suddenly, local variables/fields are just mutating out of nowhere, because it's not at all obvious that a function you're calling is taking a 'ref' parameter. This is particularly true for std.utf.decode(). Yes, I realize I could look at the function declaration. Yes, I could read the docs too. But that doesn't prevent me from forgetting that a function takes a 'ref' parameter, and then making the mistake again. The damage is done, and the time is wasted. I think D should allow 'ref' on call sites to prevent these mistakes. For example: string str = ...; size_t pos; auto chr = std.utf.decode(str, ref pos); Now it's much more obvious that the parameter is passed by reference and is going to be mutated. Ideally, this would not be optional, but rather *required*, but I realize that such a change would break a *lot* of code, so that's probably not a good idea. Thoughts? This would break UFCS severely. The following would no longer work: auto arr = [1, 2, 3, 4, 5]; arr.popFront(); // popFront takes arr by ref
Re: Discussion about D at a C++ forum
On 1/9/2012 2:56 AM, Gour wrote: On Sun, 08 Jan 2012 19:26:15 -0500 dsimchadsim...@yahoo.com wrote: As someone who does performance-critical scientific work in D, this comment is absolutely **wrong** because you only need to avoid the GC in the most performance-critical/realtime parts of your code, i.e. where you should be avoiding any dynamic allocation, GC or not. Considering we'd need to do some work for our project involving number crunching in the form of producing several libs to be (later) used by GUI part of the app, I'm curious to know do you use ncurses or just plain console output for your UI? Pure command line/console.
Re: Discussion about D at a C++ forum
On 1/8/2012 6:28 PM, Mehrdad wrote: On 1/7/2012 10:57 PM, Jonathan M Davis wrote: Not exactly the most informed discussion. Well, some of their comments _ARE_ spot-on correct... 2. While you can avoid the garbage collector, that basically means you can't use most of the standard library. Looks pretty darn correct to me -- from the fixed-size array literal issue (literals are on the GC heap), to all the string operations (very little is usable), to associative arrays (heck, they're even part of the language, but you can't use them without a GC), etc... As someone who does performance-critical scientific work in D, this comment is absolutely **wrong** because you only need to avoid the GC in the most performance-critical/realtime parts of your code, i.e. where you should be avoiding any dynamic allocation, GC or not. (Though GC is admittedly worse than malloc, at least given D's current quality of implementation.) My style of programming in D is to consciously transition between high-level D and low-level D depending on what I'm doing. Low-level D avoids the GC, avoids heavy use of std.range/std.algorithm (since the compiler doesn't optimize these well yet), and avoids basically anything else where the cost isn't clear. It's a PITA to program in like all low-level languages, but not as bad as C or C++. Nonetheless low-level D is just as fast as C or C++. High-level D is slower than C or C++ but faster than Python, and integrates much more cleanly with low-level D than Python does with C and C++. It's only slightly harder to program in than Python. Bottom line: D doesn't give you a free lunch but it does give you a cheaper lunch than C, C++ or even a combination of C/C++ and Python.
Removing the Lock for Small GC Allocations: Clarification of GC Design?
I have a plan to avoid the GC lock for most small (less than one page) GC allocations. I hope to have a pull request within a week or two, in time for the next release. There's one detail I need clarified by Sean, Walter or someone who designed the D GC. Currently small allocations are handled by popping a block off a free list, if a block is available. I plan to make each page owned by a single thread, and make the free lists thread-local. The array of free lists (one for each power of two size) is stored in the Gcx struct. The easiest way to make this array thread-local is to move it out of the Gcx struct and make it global. Is there any reason why more than one instance of Gcx would exist (maybe as an implementation detail of shared libraries, etc.)? If not, what's the point of having the Gcx struct instead of just making its variables global?
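For illustration, the lock-free fast path being proposed could look roughly like this sketch. FreeNode, allocSmall and refillFromGlobalPool are hypothetical names, not the actual Gcx internals:

```d
struct FreeNode { FreeNode* next; }

// D module-level variables are thread-local by default, so each thread
// gets its own array of free lists (one per power-of-two size class).
FreeNode*[8] tlsFreeLists;

void* allocSmall(size_t sizeClass)
{
    auto head = tlsFreeLists[sizeClass];
    if (head !is null)
    {
        // Common case: pop off the thread-local list, no lock taken.
        tlsFreeLists[sizeClass] = head.next;
        return head;
    }
    // Slow path: take the GC lock, claim a fresh page for this thread,
    // carve it into blocks and retry.
    return refillFromGlobalPool(sizeClass);
}

void* refillFromGlobalPool(size_t sizeClass)
{
    return null; // placeholder for the locked slow path
}
```

The key property is that the lock is only touched when a thread's own list runs dry, which is exactly what making each page owned by a single thread buys.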
Re: CURL Wrapper: Congratulations Next up: std.serialize
On Wednesday, 28 December 2011 at 16:01:50 UTC, Jacob Carlborg wrote: Running the unit tests: ./unittest.sh Use make to compile the library or create an executable using rdmd. A few things to think about that need to be resolved: * This is quite a large library and I really don't want to put it all into one module. I'm hoping it will be OK with a package So the package would be std.serialize? * I would really like to keep the unit tests in their own modules because they're quite large and the modules are already large without the unit tests in them Sounds reasonable. It goes against the Phobos convention, but it sounds like you have a good reason to. * The unit tests use a kind of mini-unit test framework. Should that be kept or removed? I haven't looked at it yet, but if it's generally useful, maybe it should be extracted and exposed as part of Phobos. I'd say keep it for now but keep it private, and later make a proposal for a full review to make it a public, official part of Phobos. Note: The documentation is generated using D1, I don't think that should make a difference though.
Re: A nice way to step into 2012
On Tuesday, 27 December 2011 at 15:19:07 UTC, dsimcha wrote: On Tuesday, 27 December 2011 at 15:11:25 UTC, Andrei Alexandrescu wrote: Imagine how bitter I am that the string lambda syntax didn't catch on! Andrei Please tell me they're not going anywhere. I **really** don't want to deal with those being deprecated. ...and they were kind of useful in that you could introspect the string and apply optimizations depending on what the lambda was. I wrote a sorting function that introspected the lambda that was passed to it. If it was "a < b", "a > b", "a <= b", etc., and the array to be sorted was floating point, it punned and bit twiddled the floats/doubles to ints/longs, sorted them and bit twiddled and punned them back.
Re: A nice way to step into 2012
On Tuesday, 27 December 2011 at 15:11:25 UTC, Andrei Alexandrescu wrote: Imagine how bitter I am that the string lambda syntax didn't catch on! Andrei Please tell me they're not going anywhere. I **really** don't want to deal with those being deprecated.
Re: Looking for SciLib
On Monday, 26 December 2011 at 10:11:01 UTC, Lars T. Kyllingstad wrote: So submitting Cristi's library for inclusion in Phobos is now off the table? -Lars In the _near_ future, yes. It's still too much of a work in progress. Submitting to Phobos is still the eventual goal, though.
CURL Wrapper: Congratulations Next up: std.serialize
By a vote of 14-0, Jonas Drewsen's CURL wrapper (std.net.curl) has been accepted into Phobos. Thanks to Jonas for his hard work and his persistence through the multiple rounds of review that it took to get this module up to Phobos's high and increasing quality standard. Keep the good work coming. Next in line, if it's ready, is Jacob Carlborg's std.serialize. Jacob, please post here when you've got something ready to go.
Re: Looking for SciLib
On Monday, 26 December 2011 at 00:46:44 UTC, Jonathan M Davis wrote: Sounds like they should probably be merged at some point. - Jonathan M Davis Yeah, I've started working on Cristi's fork now that I've built a good enough mental model of the implementation details that I can modify the code. This fork is still very much a work in progress. A merge with Lars's code is a good idea at some point, but right now debugging and fleshing out the linalg stuff is a higher priority.
Re: Binary Size: function-sections, data-sections, etc.
Indeed, a couple small programs I wrote today behave erratically w/ gc-sections. This only seems to occur on DMD, but I'm not sure if this is a bug in DMD or if differences in library build configurations between compilers (these are workarounds for bugs in GDC and LDC) explain it. On Wednesday, 21 December 2011 at 04:15:21 UTC, Artur Skawina wrote: On 12/20/11 19:59, Trass3r wrote: Seems like --gc-sections _can_ have its pitfalls: http://blog.flameeyes.eu/2009/11/21/garbage-collecting-sections-is-not-for-production Also I read somewhere that --gc-sections isn't always supported (no standard switch or something like that). The scenario in that link apparently involves a hack, where a completely unused symbol is used to communicate with another program/library (which checks for its presence with dlsym(3)). The linker will omit that symbol, as nothing else references it - the solution is to simply reference it from somewhere. Or explicitly place it in a used section. Or incrementally link in the unused symbols _after_ the gc pass. Or... If you use such hacks you have to handle them specially; there's no way for the compiler to magically know which unreferenced symbols are not really unused. (which is also why this optimization isn't very useful for shared libs - every visible symbol has to be assumed used, for obvious reasons) The one potential problematic case i mentioned in that gdc bug mentioned above is this: If the D runtime (most likely GC) needs to know the start/end of the data and bss sections _and_ does it in a way that can confuse it if some unreferenced parts of these sections disappear and/or are reordered, then turning on the section GC could uncover this bug. From the few simple tests i ran here everything seems to work fine, but I did not check the code to confirm there are no incorrect assumptions present. 
I personally see no reason not to use -ffunction-sections and -fdata-sections for compiling phobos though, cause a test with gdc didn't even result in a much bigger lib file, nor did it take significantly longer to compile/link. A 737k -> 320k executable size reduction is a compelling argument. That site I linked claims though, that it does mean serious overhead even if --gc-sections is omitted then. ? So we have to do tests with huge codebases first. yes. artur
auto + Top-level Const/Immutable
The changes made to IFTI in DMD 2.057 are great, but they reveal another hassle with getting generic code to play nice with const. import std.range, std.array; ElementType!R sum(R)(R range) { if(range.empty) return 0; auto ans = range.front; range.popFront(); foreach(elem; range) ans += elem; return ans; } void main() { const double[] nums = [1, 2, 3]; sum(nums); } test.d(8): Error: variable test9.sum!(const(double)[]).sum.ans cannot modify const test.d(14): Error: template instance test9.sum!(const(double)[]) error instantiating Of course this is fixable with an Unqual, but it requires the programmer to remember this every time, and it breaks for structs with indirection. Should we make `auto` also strip top-level const from primitives and arrays and, if const(Object)ref gets in, from objects?
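For reference, here is the Unqual workaround applied to that example. Stripping the qualifier from the element type lets the accumulator be mutable, so the const(double)[] instantiation compiles:

```d
import std.range, std.array;
import std.traits : Unqual;

// Accumulate into a mutable copy of the element type rather than the
// (possibly const) element type itself.
Unqual!(ElementType!R) sum(R)(R range)
{
    if (range.empty) return 0;
    Unqual!(ElementType!R) ans = range.front;
    range.popFront();
    foreach (elem; range) ans += elem;
    return ans;
}

void main()
{
    const double[] nums = [1, 2, 3];
    assert(sum(nums) == 6);
}
```

As the post notes, this only works because double has no indirection; for a struct containing pointers, Unqual would produce a type the const element can't implicitly convert to.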
Re: Top C++
On Tuesday, 20 December 2011 at 15:21:46 UTC, deadalnix wrote: http://www.johndcook.com/blog/2011/06/14/why-do-c-folks-make-things-so-complicated/ Sounds a lot like SafeD vs. non-safe D.
Binary Size: function-sections, data-sections, etc.
I started poking around and examining the details of how the GNU linker works, to solve some annoying issues with LDC. In the process I the following things that may be useful low-hanging fruit for reducing binary size: 1. If you have an ar library of object files, by default no dead code elimination is apparently done within an object file, or at least not nearly as much as one would expect. Each object file in the ar library either gets pulled in or doesn't. 2. When something is compiled with -lib, DMD writes libraries with one object file **per function**, to get around this. GDC and LDC don't. However, if you compile the object files and then manually make an archive with the ar command (which is common in a lot of build processes, such as gtkD's), this doesn't apply. 3. The defaults can be overridden if you compile your code with -ffunction-sections and -fdata-sections (DMD doesn't support this, GDC and LDC do) and link with --gc-sections. -ffunction-sections and -fdata-sections cause each function or piece of static data to be written as its own section in the object file, instead of having one giant section that's either pulled in or not. --gc-sections garbage collects unused sections, resulting in much smaller binaries especially when the sections are fine-grained. On one project I'm working on, I compiled all the libs I use with GDC using -ffunction-sections -fdata-sections. The stripped binary is 5.6 MB when I link the app without --gc-sections, or 3.5 MB with --gc-sections. Quite a difference. The difference would be even larger if Phobos were compiled w/ -ffunction-sections and -fdata-sections. (See https://bitbucket.org/goshawk/gdc/issue/293/ffunction-sections-fdata-sections-for ). DMD can't compile libraries with -ffunction-sections or -fdata-sections and due to other details of my build process that are too complicated to explain here, the results from DMD aren't directly comparable to those from GDC. 
However, --gc-sections reduces the DMD binaries from 11 MB to 9 MB. Bottom line: If we want to reduce D's binary size there are two pieces of low-hanging fruit: 1. Make -L--gc-sections the default in dmd.conf on Linux and probably other Posix OS's. 2. Add -ffunction-sections and -fdata-sections or equivalents to DMD and compile Phobos with these enabled. I have no idea how hard this would be, but I imagine it would be easy for someone who's already familiar with object file formats.
Re: auto + Top-level Const/Immutable
On Tuesday, 20 December 2011 at 17:46:40 UTC, Jonathan M Davis wrote: Assuming that the assignment can still take place, then making auto infer non- const and non-immutable would be an improvement IMHO. However, there _are_ cases where you'd have to retain const - a prime example being classes. But value types could have const/immutable stripped from them, as could arrays using their tail-constness. - Jonathan M Davis Right. The objects would only be head de-constified if Michael Fortin's patch to allow such things got in. A simple way of explaining this would be: "auto removes top-level const from a type T if T implicitly converts to the type that would result."
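For illustration, here is what that rule would mean in practice. This shows hypothetical behavior under the proposal, not what the compiler does today:

```d
void main()
{
    const int x = 5;
    auto y = x;  // today: y is const(int); under the rule above it would
                 // be int, since const(int) implicitly converts to int

    const Object c = new Object;
    auto o = c;  // const(Object) does NOT implicitly convert to Object
                 // (the referent would lose its protection), so o stays
                 // const either way -- the class case Jonathan mentions
}
```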
Re: Program size, linking matter, and static this()
On Tuesday, 20 December 2011 at 20:51:38 UTC, Marco Leise wrote: Am 19.12.2011, 20:43 Uhr, schrieb Jacob Carlborg d...@me.com: On Windows I see few applications that install libraries separately, unless they started on Linux or the libraries are established like DirectX. In the past DLLs from newly installed programs used to overwrite existing DLLs. IIRC the DLLs were then checked for their versions by installers, so they are never downgraded, but that still broke some applications with library updates that changed the API. Starting with Vista, there is the winsxs directory that - as I understand it - keeps a copy of every version of every dll associated to the programs that installed/use them. Minor nitpick: winsxs has been around since XP.
Re: Looking for SciLib
On 12/20/2011 5:58 PM, filgood wrote: or this?...seems contain further development https://github.com/cristicbz/scid I mentored that GSoC project, so since people appear to be interested in it I'll give a status report. The GSoC project was left in a somewhat rough/half-finished state at the end of GSoC because of several unanticipated problems and the ambitious (possibly overly so) nature of the project. Cristi (the GSoC student I mentored) and I have been slowly improving things since the end of GSoC but both of us have limited time to work on it. From GSoC we got a solid set of matrix/vector containers and an expression template system. AFAIK there are no major issues with these. The expression template system supports addition, subtraction, multiplication and division/inversion with matrices, vectors and scalars. The expression template evaluator works well for general matrix storage, but is badly broken for packed matrices (e.g. triangular, symmetric, diagonal). Fixing this is time consuming but is a simple matter of programming. I'm starting to implement Lapack wrappers for common matrix factorizations in scid.linalg. These are tedious to write because of the obtuseness of the Lapack API and the need to support both a high-level interface and one that allows very explicit memory management. Again, though, it's a simple matter of programming. There are a few performance problems that need to be ironed out (though perhaps I should do some benchmarking before I claim so boldly that these problems are serious). See https://github.com/cristicbz/scid/issues/77 . After the above issues are resolved, I think the next thing on the roadmap would be to start building on this foundation to add support for higher level scientific computing stuff. For example, I have a bunch of statistics/machine learning code (https://github.com/dsimcha/dstats) that was written before SciD existed. 
I'm slowly integrating it with SciD's current foundation and will probably merge it with SciD once SciD is more stable and bug-free.
Re: Reducing Linker Bugs
On 12/19/2011 12:54 AM, Walter Bright wrote: On 12/18/2011 8:38 PM, dsimcha wrote: Two questions: 1. What's the best way to file a bug report against Optlink when I get one of those Optlink terminated unexpectedly windows and I'm linking in libraries that I don't have the source code to and thus can't reduce? In that case, the best thing is to zip it all up and file a bugzilla report on it. Do you need the sources or just the object/library binaries?
Re: Reducing Linker Bugs
The OMF library that I don't have the source to is a BLAS/LAPACK stub library that calls into a DLL. It was uploaded ~5 years ago to DSource by Bill Baxter. I know absolutely no details about how he compiled it. On Monday, 19 December 2011 at 18:04:26 UTC, Walter Bright wrote: On 12/19/2011 5:51 AM, dsimcha wrote: On 12/19/2011 12:54 AM, Walter Bright wrote: On 12/18/2011 8:38 PM, dsimcha wrote: Two questions: 1. What's the best way to file a bug report against Optlink when I get one of those Optlink terminated unexpectedly windows and I'm linking in libraries that I don't have the source code to and thus can't reduce? In that case, the best thing is to zip it all up and file a bugzilla report on it. Do you need the sources or just the object/library binaries? For linker problems, doan need no steenkin' sources. BTW, optlink is known to have problems with weak extern records. Where did your omf libraries come from?
Re: Java Scala
On Monday, 19 December 2011 at 19:52:41 UTC, ddverne wrote: On Sunday, 18 December 2011 at 07:09:21 UTC, Walter Bright wrote: A programmer who doesn't know assembler is never going to write better than second rate programs. Please I don't want to flame this thread or anything like that, but isn't this a lack of modesty, or at least a little odd? The phrase: Who never wrote anything in ASM will not make a first-rate program is a bit odd, because for me it's like saying: A programmer who never programs on punched cards will never going to write a first-rate program. Finally, what I mean is: Saying that will bring something good for the community? Or should a new programmer would stop his D programming studies and start with Assembly. That misses the point. Assembly language teaches the fundamentals of how a computer works at a low level. It's similar to learning Lisp in that it makes you better able to reason about programming even if you never actually program in it. The only difference is that Lisp stretches your reasoning ability towards the highest abstraction levels, assembly language does it for the lowest levels. Programming on punchcards is equivalent to typing: It is/was sometimes a necessary practical skill, but there's nothing conceptually deep about it that makes it worth learning even if it's not immediately practical.
Re: Java Scala
On 12/18/2011 2:14 AM, Russel Winder wrote: Python is also used in industry and commerce, so it is not just a teaching language. Almost all post-production software uses C++ and Python. Most HPC is now Fortran, C++ and Python. This latter would be a great area for D to try and break into, but sadly I don't think it would now be possible. Please elaborate. I think D for HPC is a terrific idea. It's the only language I know of with all of the following four attributes: 1. Allows you to program straight down to the bare metal with zero or close to zero overhead, like C and C++. 2. Interfaces with C and Fortran legacy code with minimal or no overhead. 3. Has modern convenience/productivity features like GC, a real module system and structural typing via templates. 4. Has support for parallelism in the standard library. (I'm aware of OpenMP, but in my admittedly biased opinion std.parallelism is orders of magnitude more flexible and easier to use.)
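As a taste of point 4, std.parallelism turns an ordinary loop into a parallel one with essentially a one-word change:

```d
import std.parallelism : parallel;
import std.math : sqrt;

void main()
{
    auto results = new double[1_000_000];
    // Wrapping the aggregate in parallel() distributes the iterations
    // across a default task pool sized to the number of cores.
    foreach (i, ref r; parallel(results))
        r = sqrt(cast(double) i);
}
```

Each iteration must be independent of the others, which is exactly the constraint OpenMP's parallel for imposes as well.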
Re: Java Scala
On 12/18/2011 2:09 AM, Walter Bright wrote: A programmer who doesn't know assembler is never going to write better than second rate programs. I don't even know assembler that well and I agree 100%. I can read bits of assembler and recognize compiler optimizations and could probably mechanically translate C code to x86 assembler, but I'd be lost if asked to write anything more complicated than a small function from scratch or do anything without some reference material. Even this basic level of knowledge has given me insights into language design. For example: I'd love to be asked in an interview whether default arguments to virtual functions are determined by the compile time or runtime type of the object. To someone who knows nothing about assembler this seems like the most off-the-wall language-lawyer minutiae imaginable. To someone who knows assembler, the answer is obviously the compile time type. Otherwise, you'd have to store the function's default arguments in the virtual function table somehow, then look each one up and push it onto the stack at the call site. This would get very hairy and inefficient very fast.
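The interview question above, in code (a sketch; Base and Derived are made-up names):

```d
import std.stdio;

class Base
{
    void greet(string name = "Base") { writeln("Hello from ", name); }
}

class Derived : Base
{
    override void greet(string name = "Derived") { writeln("Hello from ", name); }
}

void main()
{
    Base b = new Derived;
    // The call dispatches virtually to Derived.greet, but the default
    // argument is substituted at the call site from the compile-time
    // type (Base) -- so this prints "Hello from Base".
    b.greet();
}
```

This is exactly the point in the post: the caller's compiled code pushes the default argument itself, so only the static type can supply it without storing defaults in the vtable.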
Reducing Linker Bugs
Two questions: 1. What's the best way to file a bug report against Optlink when I get one of those Optlink terminated unexpectedly windows and I'm linking in libraries that I don't have the source code to and thus can't reduce? 2. I'm getting on the Optlink hating bandwagon. How hard would it be to use Objconv (http://www.agner.org/optimize/#objconv) to convert the OMF object files DMD outputs to COFF and then use the MinGW linker to link the COFF files, and automate this process in DMD?
[Issue 7130] New: NRVO Bug: Wrong Code With D'tor + Conditional Return
The NG server was down when I submitted this to Bugzilla and it's a pretty important issue, so I'm posting it to the NG manually now: http://d.puremagic.com/issues/show_bug.cgi?id=7130 import core.stdc.stdio; struct S { this(this) { printf("Postblit\n"); } ~this() { printf("D'tor\n"); } } S doIt(int i) { S s1; S s2; printf("s1 lives at %p.\n", &s1); printf("s2 lives at %p.\n", &s2); return (i == 42) ? s1 : s2; } void main() { auto s = doIt(3); printf("s lives at %p.\n", &s); } Output: s1 lives at 0xffc54368. s2 lives at 0xffc54369. D'tor D'tor s lives at 0xffc5437c. D'tor Both D'tors are called and the returned result lives at a different address after being returned than before, as expected if not using NRVO. On the other hand, no postblit is called for whichever struct is returned, as expected if using NRVO.
CURL Wrapper: Vote Thread
The time has come to vote on the inclusion of Jonas Drewsen's CURL wrapper in Phobos. Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html For those of you on Windows, a libcurl binary built by DMC is available at http://gool.googlecode.com/files/libcurl_7.21.7.zip. Voting lasts one week and ends on 12/24.
Re: Second Round CURL Wrapper Review
On Tuesday, 13 December 2011 at 00:47:26 UTC, David Nadlinger wrote: I don't know if you already have a solution in the works, but maybe the future interface I did for Thrift is similar to what you are looking for: http://klickverbot.at/code/gsoc/thrift/docs/thrift.util.future.html David Doesn't std.parallelism's task parallelism API work for this? (Roughly speaking a task in std.parallelism == a future in your Thrift API.) If not, what can I do to fix it so that it can? Looking briefly at your API, one thing I notice is the ability to cancel a future. This would be trivial to implement in std.parallelism for tasks that haven't yet started executing, but difficult if not impossible for tasks that are already executing. Does your Thrift API allow cancelling futures that are already executing? If so, how is that accomplished? The TFutureAggregatorRange could be handled by a parallel foreach loop if I understand correctly, though it would look a little different.
Re: Fixing const arrays
On 12/10/2011 4:47 PM, Andrei Alexandrescu wrote: We decided to fix this issue by automatically shedding the top-level const when passing an array or a pointer by value into a function. Really silly question: Why not do the same for primitives (int, float, char, etc.) or even structs without indirection? I've seen plenty of code that blows up when passed an immutable double because it tries to mutate its arguments. About 1.5 years ago I fixed a bug like this in std.math.pow().
Re: A benchmark, mostly GC
On 12/11/2011 4:26 PM, Timon Gehr wrote: We are talking about supporting precise GC, not about custom runtime reflection. There is no way to get precise GC right without compiler support. FWIW my original precise heap scanning patch generated pointer offset information using CTFE and templates. The code to do this is still in Bugzilla and only took a couple hours to write.
Re: Second Round CURL Wrapper Review
Here's my review. Remember, review ends on December 16. Overall, this library has massively improved due to the rounds of review it's been put through. I only found a few minor nitpicks. However, a recurring pattern is minor grammar mistakes in the documentation. Please proofread all documentation again. Docs: The high level API is build -> The high level API is built. LibCurl is licensed under a MIT/X derivate license -> LibCurl is licensed under an MIT/X derivative license. AutoConnect: Connection type used when the url should be used to auto detect protocol. -> auto detect THE protocol. Why is there a link to curl_easy_set_opt in the byLineAsync and byChunkAsync docs? In onSend: The length of the void[] specifies the maximum number of bytes that can be send. -> can be SENT. What is the use case for exposing struct Curl? I'd prefer it if this were unexposed because we'll obviously be unable to provide a replacement if/when the backend to this library is rewritten in pure D. Actually, that leads to another question: Should this module really be named etc.curl/std.curl/std.net.curl, or should the fact that it currently uses Curl as a backend be relegated to an implementation detail? Code: pragma(lib) basically doesn't work on Linux because the object format doesn't support it. Don't rely on it. Should the protocol detection be case-insensitive, i.e. "ftp://" == "FTP://"?
Re: Second Round CURL Wrapper Review
On 12/11/2011 7:53 PM, dsimcha wrote: Should the protocol detection be case-insensitive, i.e. "ftp://" == "FTP://"? Oh, one more thing: Factor the protocol detection out into a function. You have the same expression cut and pasted everywhere: if(url.startsWith("ftp://") || url.startsWith("ftps://")) ...
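For illustration, the suggested refactoring is a one-liner; the helper name isFTPUrl is mine, not anything in the module under review:

```d
import std.algorithm : startsWith;

// Hypothetical helper replacing the cut-and-pasted startsWith chain.
bool isFTPUrl(string url)
{
    return url.startsWith("ftp://") || url.startsWith("ftps://");
}

unittest
{
    assert(isFTPUrl("ftp://example.com/file"));
    assert(isFTPUrl("ftps://example.com/file"));
    assert(!isFTPUrl("http://example.com/"));
}
```

It also gives a single place to add the case-insensitivity question raised above, if that behavior is wanted.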
Re: A benchmark, mostly GC
On 12/11/2011 9:41 PM, Timon Gehr wrote: On 12/11/2011 11:37 PM, dsimcha wrote: On 12/11/2011 4:26 PM, Timon Gehr wrote: We are talking about supporting precise GC, not about custom runtime reflection. There is no way to get precise GC right without compiler support. FWIW my original precise heap scanning patch generated pointer offset information using CTFE and templates. The code to do this is still in Bugzilla and only took a couple hours to write. But it is not precise for the stack, right? How much work is left to the programmer to generate the information? It wasn't precise on the stack, but for unrelated reasons. As far as work left to the programmer, I had created templates for new (which I thought at the time might get integrated into the compiler). To use the precise heap scanning, all you had to do was: class C { void* ptr; size_t integer; } void main() { auto instance = newTemplate!C(); }
DustMite: Unwrap? Imports?
I've recently started using DustMite to reduce compiler errors in SciD, which instantiates an insane number of templates and is nightmarish to reduce by hand. Two questions:

1. What exactly does unwrap (as opposed to remove) do?

2. When there are multiple imports in a single statement (i.e. "import foo, bar;"), does DustMite try to get rid of individual ones without deleting the whole statement? Is this what unwrap does?
Re: rt_finalize WTFs?
== Quote from Martin Nowak (d...@dawgfoto.de)'s article: I appreciate the recursion during mark, wanted to do this myself some time ago but expected a little more gain.

The reason the gain wasn't huge is that, on the benchmark I have that involves a deep heap graph, sweeping time dominates marking time. The performance gain for the mark phase only (which is important b/c this is when the world needs to be stopped) is ~20-30%.

Some more ideas: - Do a major refactoring of the GC code, making it less resistant to change. Adding sanity checks or unit tests would be great. This probably reveals some obfuscated performance issues.

Not just obfuscated ones. I've wanted to fix an obvious perf bug for two years and haven't done it because the necessary refactoring would be unbelievably messy and I'm too afraid I'll break something. Basically, malloc() sets the bytes between the size you requested and the size of the block actually allocated to zero to prevent false pointers. This is reasonable. The problem is that it does so **while holding the GC's lock**. Fixing it for just the case when malloc() is called by the user is easy. The problem is fixing it when malloc() gets called from realloc(), calloc(), etc.

- Add more realistic GC benchmarks, just requires adding to druntime/test/gcbench using the new runbench. The tree1 mainly uses homogeneous classes, so this is very synthetic.

I'll crowdsource this. I can't think of any good benchmarks that are a few hundred lines w/ no dependencies but aren't pretty synthetic.

- There is one binary search pool lookup for every scanned address in range. Should be a lot to gain here, but it's difficult. It needs a multilevel mixture of bitset/hashtab.

I understand the problem, but please elaborate on the proposed solution. You've basically got a bunch of pools, each of which represents a range of memory addresses, not a single address (so a basic hashtable is out). You need to know which range some pointer fits in. 
How would you beat binary search/O(log N) for this?

- Reduce the GC roots range. I will have to work on this for shared library support anyhow.

Please clarify what you mean by reducing the roots range. Thanks for the feedback/suggestions.
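For reference, the lookup being discussed can be sketched like this: each pool owns a half-open address range, and finding the pool for a scanned address is a binary search over the sorted range table. Names are illustrative, not druntime's:

struct Pool
{
    size_t baseAddr; // first address owned by the pool
    size_t topAddr;  // one past the last owned address
}

// Returns the index of the pool containing addr, or -1 if none.
// This is the O(log N) search done once per scanned address.
ptrdiff_t findPool(const(Pool)[] pools, size_t addr)
{
    size_t lo = 0, hi = pools.length;
    while (lo < hi)
    {
        immutable mid = (lo + hi) / 2;
        if (addr < pools[mid].baseAddr)
            hi = mid;
        else if (addr >= pools[mid].topAddr)
            lo = mid + 1;
        else
            return mid;
    }
    return -1;
}

void main()
{
    auto pools = [Pool(0x1000, 0x2000), Pool(0x4000, 0x8000)];
    assert(findPool(pools, 0x1800) == 0);
    assert(findPool(pools, 0x5000) == 1);
    assert(findPool(pools, 0x3000) == -1); // between pools
}

The cost that matters is not the comparisons themselves but that this runs for every address-like word the mark phase scans.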
Re: rt_finalize WTFs?
== Quote from Martin Nowak (d...@dawgfoto.de)'s article: More promising is to put pool address ranges in a trie.

addr[7]               [... . ...]
                     /     |     \
addr[6]    [... . ...]          [... . ...]
          /     |    \         /     |     \
addr[5]  pool:8        [... . ...]
                      /     |     \
addr[4]              pool:8   []   pool:5

Actually 64-bit should use a hashtable for the upper 32 bits and then the 32-bit trie for the lower.

Why do you expect this to be faster than a binary search? I'm not saying it won't be, just that it's not a home run that deserves a high priority as an optimization. You still have a whole bunch of indirections, probably more than you would ever have for binary search.
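To make the indirection count concrete, here is an illustrative sketch of the trie idea (my own naming, not Martin's code): index successive address bytes so a lookup is a bounded walk of dependent loads rather than a binary search.

struct TrieNode
{
    TrieNode*[256] children; // indexed by the next address byte
    int poolIndex = -1;      // >= 0 once a whole subtree is one pool
}

// Walk from the most significant address byte downward. Each level
// is one dependent pointer load, which is the crux of the
// "is this actually faster?" question.
int lookup(TrieNode* root, size_t addr)
{
    auto node = root;
    for (int shift = (size_t.sizeof - 1) * 8; shift >= 0; shift -= 8)
    {
        if (node.poolIndex >= 0)
            return node.poolIndex; // whole subtree maps to one pool
        node = node.children[(addr >> shift) & 0xFF];
        if (node is null)
            return -1; // no pool owns this address
    }
    return node.poolIndex;
}

void main()
{
    // Degenerate trie where the root already resolves to pool 3.
    auto root = new TrieNode;
    root.poolIndex = 3;
    assert(lookup(root, 0xDEADBEEF) == 3);

    // And one where nothing is mapped at all.
    auto empty = new TrieNode;
    assert(lookup(empty, 0x1234) == -1);
}

The win, if any, comes from collapsing whole subtrees into a single pool entry near the root; a deep walk is plausibly no cheaper than log2(npools) comparisons.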
Re: rt_finalize WTFs?
On 12/5/2011 6:39 PM, Trass3r wrote: On 05/12/2011 01:46, dsimcha wrote: I'm at my traditional pastime of trying to speed up D's garbage collector again. Have you thought about pushing for the inclusion of CDGC at all, or working on the tweaks needed to make it the main GC? So true, it's been rotting in that branch.

IIRC CDGC includes two major enhancements:

1. The snapshot GC for Linux. (Does this work on OSX/FreeBSD/anything Posix, or just Linux? I'm a bit skeptical about whether a snapshot GC is really that great an idea, given its propensity to waste memory on long collect cycles with a lot of mutation.)

2. I think there was some precise heap scanning-related stuff in it. I originally tried to implement precise heap scanning a couple of years ago, but it went nowhere for reasons too complicated to explain here. Given this experience, I'm not inclined to try again until the compiler has extensions for generating pointer offset information.
rt_finalize WTFs?
I'm at my traditional pastime of trying to speed up D's garbage collector again, and I've stumbled on the fact that rt_finalize is taking up a ridiculous share of the time (~30% of total runtime) on a benchmark where huge numbers of classes **that don't have destructors** are being created and collected. Here's the code to this function, from lifetime.d:

extern (C) void rt_finalize(void* p, bool det = true)
{
    debug(PRINTF) printf("rt_finalize(p = %p)\n", p);

    if (p) // not necessary if called from gc
    {
        ClassInfo** pc = cast(ClassInfo**)p;
        if (*pc)
        {
            ClassInfo c = **pc;
            byte[] w = c.init;

            try
            {
                if (det || collectHandler is null || collectHandler(cast(Object)p))
                {
                    do
                    {
                        if (c.destructor)
                        {
                            fp_t fp = cast(fp_t)c.destructor;
                            (*fp)(cast(Object)p); // call destructor
                        }
                        c = c.base;
                    } while (c);
                }

                if ((cast(void**)p)[1]) // if monitor is not null
                    _d_monitordelete(cast(Object)p, det);

                (cast(byte*) p)[0 .. w.length] = w[]; // WTF?
            }
            catch (Throwable e)
            {
                onFinalizeError(**pc, e);
            }
            finally // WTF?
            {
                *pc = null; // zero vptr
            }
        }
    }
}

Getting rid of the stuff I've marked with //WTF? comments (namely the finally block and the re-initializing of the memory occupied by the finalized object) speeds things up by ~15% on the benchmark in question. Why do we care what state the blob of memory is left in after we finalize it? I can kind of see that we want to clear things if delete/clear was called manually and we want to leave the object in a state that doesn't look valid. However, this has significant performance costs, IIRC it is already done in clear(), and delete is supposed to be deprecated. Furthermore, I'd like to get rid of the finally block entirely, since I assume its presence and its effect on the generated code is causing the slowdown, not the body, which just assigns a pointer. Is there any good reason to keep this code around?
Re: rt_finalize WTFs?
Thanks for the benchmark. I ended up deciding to just create a second function, rt_finalize_gc, that gets rid of a whole bunch of cruft that isn't necessary in the GC case. I think it's worth the small amount of code duplication it creates. Here are the results of my efforts so far: https://github.com/dsimcha/druntime/wiki/GC-Optimizations-Round-2 . I've got one other good idea that I think will shave a few seconds off the Tree1 benchmark if I don't run into any unforeseen obstacles in implementing it.

On 12/4/2011 10:07 PM, Martin Nowak wrote: On Mon, 05 Dec 2011 02:46:27 +0100, dsimcha dsim...@yahoo.com wrote: [snip: the full rt_finalize listing and questions from the original post] Is there any good reason to keep this code around?

Not for the try block. With errors not being recoverable, you don't need to care about zeroing the vtbl, or you could just copy the code into the catch handler. This seems to cause fewer spilled variables. Most expensive is the call to memcpy@PLT; replace it with something inlineable. Zeroing is not much faster than copying init[] for small classes. At least zeroing should be worth it, unless the GC would not scan the memory otherwise.

gcbench/tree1 = 41.8s => https://gist.github.com/1432117 => gcbench/tree1 = 33.4s

Please add useful benchmarks to druntime. martin
Re: gl3n - linear algebra and more for D
I don't know much about computer graphics, but I take it that a sane design for a matrix/vector library geared towards graphics is completely different from one geared towards general numerics/scientific computing? I'm trying to understand whether SciD (which uses BLAS/LAPACK and expression templates) overlaps with this at all.

On 12/2/2011 5:36 PM, David wrote: Hello, I am currently working on gl3n - https://bitbucket.org/dav1d/gl3n - gl3n provides all the math you need to work with OpenGL, DirectX or just vectors and matrices (it's mainly targeted at graphics - gl3n will never be more than a pure math library). What it supports:

* vectors
* matrices
* quaternions
* interpolation (lerp, slerp, hermite, catmull rom, nearest)
* nearly all glsl functions (according to spec 4.1)
* some more cool features, like templated types (vectors, matrices, quats), cool ctors, dynamic swizzling

And the best is, it's MIT licensed ;). Unfortunately there's no documentation yet, but it shouldn't be hard to understand how to use it. If you ever run into trouble, just take a look into the source; I added unittests to every part of the lib, so you can see how it works by looking at the unittests. Furthermore I am very often at #D on freenode. But gl3n isn't finished! My current plans are to add more interpolation functions and the rest of the glsl defined functions, but I am new to graphics programming (about 4 months into OpenGL now), so tell me what you're missing - the chances are good that I'll implement and add it. So let me know what you think about it. Before I forget, a bit of code to show you how to use gl3n:

vec4 v4 = vec4(1.0f, vec3(2.0f, 3.0f, 4.0f));
vec4 v4 = vec4(1.0f, vec4(1.0f, 2.0f, 3.0f, 4.0f).xyz);

// dynamic swizzling with opDispatch
vec3 v3 = my_3dvec.rgb;
float[] foo = v4.xyzzzwzyyxw; // not useful but possible!
glUniformMatrix4fv(location, 1, GL_TRUE, mat4.translation(-0.5f, -0.54f, 0.42f).rotatex(PI).rotatez(PI/2).value_ptr); // yes they are row major!

mat3 inv_view = view.rotation;
mat3 inv_view = mat3(view);

mat4 m4 = mat4(vec4(1.0f, 2.0f, 3.0f, 4.0f), 5.0f, 6.0f, 7.0f, 8.0f, vec4(…) …);

struct Camera {
    vec3 position = vec3(0.0f, 0.0f, 0.0f);
    quat orientation = quat.identity;

    Camera rotatex(real alpha) { orientation.rotatex(alpha); return this; }
    Camera rotatey(real alpha) { orientation.rotatey(alpha); return this; }
    Camera rotatez(real alpha) { orientation.rotatez(alpha); return this; }

    Camera move(float x, float y, float z) { position += vec3(x, y, z); return this; }
    Camera move(vec3 s) { position += s; return this; }

    @property camera() {
        //writefln("yaw: %s, pitch: %s, roll: %s", degrees(orientation.yaw), degrees(orientation.pitch), degrees(orientation.roll));
        return mat4.translation(position.x, position.y, position.z) * orientation.to_matrix!(4,4);
    }
}

glUniformMatrix4fv(programs.main.view, 1, GL_TRUE, cam.camera.value_ptr);
glUniformMatrix3fv(programs.main.inv_rot, 1, GL_TRUE, cam.orientation.to_matrix!(3,3).inverse.value_ptr);

I hope this gave you a little introduction to gl3n. - dav1d
Re: Java Scala
On 12/3/2011 10:39 AM, Andrei Alexandrescu wrote: On 12/3/11 3:02 AM, Russel Winder wrote: The PyPy JIT is clearly a big win. I am sure Armin will come up with more stuff :-) Do they do anything about the GIL? Andrei Unfortunately, no. I checked into this at one point because I basically use parallelism for everything in D and have an 8-core computer at work. Therefore, if PyPy is a factor of 5 (just making up numbers) slower than D for equivalently written code, it's 40x slower once you consider that parallelism is easy in D and really hard except at the coarsest grained levels in PyPy.
Second Round Review of CURL Wrapper
I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows? 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab ) Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html For those of you on Windows, a libcurl binary built by DMC is available at http://gool.googlecode.com/files/libcurl_7.21.7.zip. Review starts now and ends on December 16, followed by one week of voting. __Please post all reviews to digitalmars.D, not to the announcement forum.__
Re: Java Scala
On 12/2/2011 3:08 AM, Walter Bright wrote: On 12/1/2011 11:59 PM, Russel Winder wrote: (*) RPython is a subset of Python which allows for the creation of native code executables of interpreters, compilers, etc. that are provably faster than hand written C. http://pypy.org/ Provably faster? I can't find support for that on http://pypy.org

http://speed.pypy.org/ Not exactly rigorous mathematical proof, but pretty strong evidence. Also, I use PyPy once in a while for projects where speed matters a little but I want to share my code with Python people or want to use Python's huge standard library. Anecdotally, it's definitely faster. The reason has nothing to do with the language it's written in. It's because PyPy JIT-compiles a lot of the Python code instead of interpreting it.
Second Round CURL Wrapper Review
I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows? 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab ) Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html For those of you on Windows, a libcurl binary built by DMC is available at http://gool.googlecode.com/files/libcurl_7.21.7.zip. Review starts now and ends on December 16, followed by one week of voting.
Re: Is D more cryptic than C++?
On 11/30/2011 11:32 PM, Abrahm wrote: Jesse Phillips <jessekphillip...@gmail.com> wrote in message news:jb6qfv$1kut$1...@digitalmars.com... What bearophile was referring to was that the use of templates is common. Are you sure about that? What say you Bear? D's templates have the advantage of being easier on the eyes and more powerful (with the inclusion of 'static if' in the language). Having come from C++land, and knowing what some people do with it, making it EASIER to apply templates does not necessarily seem a good thing to me. (Ref: template metaprogramming). That said, does your statement above about D's template machinery being powerful etc. mean it's easier to do template metaprogramming in D? If so, I, personally, do not find that any asset at all (though I know some surely will, for there have been books written on that abhorrence).

A lot of people from C++ backgrounds say this. What they miss is that template metaprogramming in C++ is so ugly because the language wasn't designed for it. In D you can do readable template metaprogramming.
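A small illustrative taste of the difference: with static if, a compile-time computation reads like ordinary code instead of a chain of recursive template specializations.

// Compile-time factorial with static if: no partial
// specializations, just an if/else inside the template.
template Factorial(size_t n)
{
    static if (n <= 1)
        enum Factorial = 1;
    else
        enum Factorial = n * Factorial!(n - 1);
}

void main()
{
    static assert(Factorial!5 == 120); // computed entirely at compile time
}

The C++98 equivalent needs a primary template plus an explicit specialization for the base case; here the base case is just a branch.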
Re: Phobos Wish List/Next in Review Queue?
On 11/23/2011 9:26 PM, Walter Bright wrote: On 11/19/2011 7:02 PM, dsimcha wrote: * Streams. (Another item where the bottleneck is mostly at the design level and people not really knowing what they want.) I'm not sure what the purpose of streams would be, now that we have ranges.

Right. As I mentioned in a previous post buried deep in this thread, I think streams should just be a flavor of ranges that have most or all of the following characteristics:

1. Live in std.stream.
2. Oriented toward I/O.
3. Heavy use of higher-order ranges/stacking for things like compression/decompression and encryption/decryption.
4. Mostly focused on input ranges as opposed to random access/forward/bidirectional, since this is the best model for data from a network or stdin.
std.csv accepted into Phobos
I'm pleased to announce that, by a vote of 5-1, std.csv has been accepted into Phobos. Also, by a vote of 3-2 with one abstention, the community has decided on Version 2 of the library (the one where the Record struct, etc. is hidden in a style similar to std.algorithm rather than explicitly documented). Congratulations, Jesse.
Re: Phobos Wish List/Next in Review Queue?
On 11/20/2011 12:30 PM, Jonas Drewsen wrote: * Containers. (AFAIK no one is working on this. It's tough to get started because, despite lots of discussion at various times on this forum, no one seems to really know what they want. Since the containers in question are well-known, it's much more a design problem than an implementation problem.) * Allocators. (I think Phobos desperately needs a segmented stack/region based allocator and I've written one. I've also tried to define a generic allocator API, mostly following Andrei's suggestions, but I'll admit that I didn't really know what I was doing for the general API. Andrei has suggested that allocators should have real-world testing on containers before being included in Phobos. Therefore, containers block allocators, and if the same person doesn't write both, there will be a lot of communication overhead to make sure the designs are in sync.) I've thought about doing some containers myself but have hesitated since the general opinion seems to be that allocators need to be in place first.

Yeah, this is problematic. In voting against my allocator proposal, Andrei mentioned that he wanted the allocators to be well-tested in the container API first. This means either we have a circular dependency or allocators and containers need to be co-developed. Co-developing them is problematic. If one person does containers and another allocators, the project might be overwhelmed by communication overhead. If the same person does both, then this is asking quite a lot for a hobby project. Of course, I hope to graduate in 1 year and will be looking for a job when I do. Any company out there have a strategic interest in D and want to hire me to work full-time on allocators and containers?

* Streams. (Another item where the bottleneck is mostly at the design level and people not really knowing what they want.) What do streams provide that could not be provided by ranges? 
If I understand correctly, streams _would_ be a flavor of ranges. They would just be ranges geared towards being stacked on top of one another, specifically for the purpose of I/O. They would typically be designed around the vanilla input range (not forward, random access, etc.) or output ranges. Traditionally, streams would also be class based instead of template based. However, IMHO a good case can be made for template-based stream ranges in D because we have std.range.inputRangeObject and std.range.outputRangeObject. This means that you can stack streams using templates, with no virtual function call overhead, and then, if you need to respect some binary interface, you could just stack an inputRangeObject() or outputRangeObject() on top of all your other crap and only have one virtual call. Example:

auto lines = lineReader( gzipUncompresser( rawFile("foo.gz") ) );
// LineReader!(GzipUncompresser!(RawFile))
pragma(msg, typeof(lines));

auto objectOriented = inputRangeObject(lines);
// InputRangeObject!(char[])
pragma(msg, typeof(objectOriented));
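A tiny runnable version of the same stacking idea, using std.algorithm.map in place of the hypothetical lineReader/gzipUncompresser stages (those names are only illustrative):

import std.algorithm : map;
import std.range;

void main()
{
    // Template-stacked pipeline: zero virtual calls between stages.
    auto pipeline = [1, 2, 3].map!(x => x * 2);

    // Erase the concrete type behind a class interface when some
    // binary interface demands it: one virtual hop per element.
    InputRange!int erased = inputRangeObject(pipeline);

    int sum = 0;
    foreach (x; erased)
        sum += x;
    assert(sum == 12); // 2 + 4 + 6
}

The pipeline keeps its exact static type until the single inputRangeObject call at the top, which is the whole point.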
Re: Phobos Wish List/Next in Review Queue?
On 11/20/2011 12:30 PM, Jonas Drewsen wrote: * Some higher level networking support, such as HTTP, FTP, etc. (Jonas Drewsen's CURL wrapper handles a lot of this and may be ready for a second round of review.) As I've mentioned in another thread, it is ready for a second round of review. We just need someone to step up and run the review, since it wouldn't be appropriate for me to do it myself. Anyone?

If no one else wants to volunteer, I will again. Is there something I'm missing? I find that being review manager takes very little effort: post an initial review announcement, post a reminder, maybe post a summary/moderation message here and there, post a vote message, post a vote reminder, tally votes. It really doesn't take much time.
Phobos Wish List/Next in Review Queue?
Now that we've got a lot of contributors to Phobos and many projects in the works, I decided to start a thread to help us make a rough plan for Phobos's short-to-medium term development. There are three goals here:

1. Determine what's next in the review queue after std.csv (voting on std.csv ends tonight, so **please vote**).

2. Come up with a wish list of high-priority modules that Phobos is missing that would make D a substantially more attractive language than it is now.

3. Figure out who's already working on what from the wish list, what bottlenecks, if any, are getting in the way, and what can be done about them.

The following is the wish list as I see it. Please suggest additions and correct any errors, as this is mostly off the top of my head. Also, status updates would be appreciated if you're working on any of these and anything substantial has changed.

* Some higher level networking support, such as HTTP, FTP, etc. (Jonas Drewsen's CURL wrapper handles a lot of this and may be ready for a second round of review.)

* Serialization. (Jacob Carlborg's Orange library might be a good candidate. IIRC he said it's close to ready for review.)

* Encryption and hashing. (This is more an implementation problem than a design problem, and AFAIK no one is working on it.)

* Containers. (AFAIK no one is working on this. It's tough to get started because, despite lots of discussion at various times on this forum, no one seems to really know what they want. Since the containers in question are well-known, it's much more a design problem than an implementation problem.)

* Allocators. (I think Phobos desperately needs a segmented stack/region based allocator and I've written one. I've also tried to define a generic allocator API, mostly following Andrei's suggestions, but I'll admit that I didn't really know what I was doing for the general API. Andrei has suggested that allocators should have real-world testing on containers before being included in Phobos. Therefore, containers block allocators, and if the same person doesn't write both, there will be a lot of communication overhead to make sure the designs are in sync.)

* Streams. (Another item where the bottleneck is mostly at the design level and people not really knowing what they want.)

* Compression/archiving. (Opening standard compressed/archived file formats needs to just work. This includes at least zip, gzip, tar and bzip2. Of course, zip is already available, and gzip is supported by the zlib module but with a crufty C API. At least gzip and bzip2, which are stream-based as opposed to file-based, should be handled via streams, which means that streams block compression/archiving. Also, since tar and zip are both file based, they should probably be handled by the same API, which might mean deprecating std.zip and rewriting it.)

* An improved std.xml. (I think Thomas Sowinski is working on a replacement, but I haven't seen any updates in a long time.)

* Matrices and linear algebra. (Cristi Cobzarenco's GSoC project is a good starting point but it needs polish. I've been in contact with him occasionally since GSoC ended, and he indicated that he wants to get back to working on it but doesn't have time. I've contributed to it sparingly, but find it difficult because I haven't gotten around to familiarizing myself with the implementation details yet, and it's hard to get into a project that complex with a few hours a week as opposed to focusing on it full time.)

* std.database. (Apparently Steve Teale is working on this. This is a large, complicated project because we're trying to define a common API for a variety of RDBMSs. Again, it's more a design problem than an implementation problem.)

* Better support for creating processes/new std.process. (Lars Kyllingstad wrote a replacement candidate for Posix and Steve Schveighoffer ported it to Windows, but issues with the DMC runtime prevent it from working on Windows.)

* Parallel algorithms. (I've implemented a decent amount of these in my std.parallel_algorithm Github project, but I've become somewhat frustrated and unmotivated to finish this project because so many of the relevant algorithms seem memory-bandwidth bound and aren't substantially faster when parallelized than when run serially.)

After writing this, the general pattern I notice is that lots of stuff is blocked by design, not implementation. In a lot of cases people don't really know what they want, and analysis paralysis results.