Re: [TDPL] Russian translation of the book

2011-03-19 Thread Alexander Malakhov
Simen kjaeraas wrote in his message of Sat, 19 Mar 2011 04:17:13 +0600:


On Fri, 18 Mar 2011 18:02:03 +0100, Vladimir Panteleev  
 wrote:


On Fri, 18 Mar 2011 13:08:24 +0200, Alexander Malakhov  
 wrote:


Vladimir Panteleev wrote in his message of Wed, 16 Mar 2011 23:54:14 +0600:


On Wed, 16 Mar 2011 19:10:29 +0200, Alexander Malakhov  
 wrote:


Russian publisher "Символ-Плюс" (Symbol-Plus) is now translating
TDPL, and they are asking for volunteers to

* help the translator with technical details
* read the final version

If you wish to help, add your contacts on forum:
http://www.symbol.ru/forum/viewtopic.php?f=4&t=363


Note that the last message in that thread is from last year.


Actually it's from 2011.01.20. Go to the 2nd page :)


Who puts paging controls at the top of threads? :s


In Soviet Russia...?




*LOL*

--
Alexander


Re: Quo vadis, D2? Thoughts on the D library ecosystem.

2011-03-19 Thread Caligo
On Sat, Mar 19, 2011 at 6:12 PM, Jonathan M Davis wrote:

>
> Really, the problem is that someone needs to take the initiative on this.
> They
> need to work on setting it up and supporting the ecosystem which would
> result in
> a group of such projects. Good ideas tend to be presented around here and
> then
> go nowhere, because no one actually takes the initiative to do them. The
> "wouldn't this be a good idea?" tactic doesn't tend to get very far, even
> if
> everyone agrees, simply because someone has to put in the time and effort
> to do
> it, and while people may think that it's a good idea, there are only so
> many
> people working on Phobos and other D-related stuff, and there's a lot to be
> done,
> and everyone has something that they'd like to see done, and _that_ is what
> they're generally working on.
>
>
> - Jonathan M Davis
>

It's not that someone needs to take the initiative, it's just that there
aren't that many D developers.  I hope things improve once GDC is officially
part of GCC and D becomes available on all GNU/Linux OSs.  Another thing
that might need to happen is for the D project to join the Software Freedom
Conservancy (SFC) or form its own non-profit 501(c)(3) organization, similar
to the Python Software Foundation.

But that's just my opinion, and I could be wrong.


Re: review of std.parallelism

2011-03-19 Thread dsimcha

On 3/19/2011 4:35 PM, Andrei Alexandrescu wrote:

On 03/19/2011 12:16 PM, dsimcha wrote:

On 3/19/2011 12:03 PM, Andrei Alexandrescu wrote:

On 03/19/2011 02:32 AM, dsimcha wrote:

Ok, thanks again for clarifying **how** the docs could be improved.
I've
implemented the suggestions and generally given the docs a good read-over
and cleanup. The new docs are at:

http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html


* Still no synopsis example that illustrates in a catchy way the most
attractive artifacts.


I don't see what I could put here that isn't totally redundant with the
rest of the documentation. Anything I could think of would basically
just involve concatenating all the examples. Furthermore, none of the
other Phobos modules have this, so I don't know what one should look
like.


I'm thinking along the lines of:

http://www.digitalmars.com/d/2.0/phobos/std_exception.html

A nice synopsis would be the pi computation. Just move that up to the
synopsis. It's simple, clean, and easy to relate to. Generally, you'd
put here not all details but the stuff you think would make it easiest
for people to get into your library.
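A synopsis along those lines might look like the following sketch, assuming the `taskPool.reduce` primitive from the module under review; the exact integration formula here is illustrative, not the final docs' example:

```d
import std.algorithm, std.parallelism, std.range;

void main()
{
    // Integrate 4/(1 + x^2) over [0, 1] to approximate pi,
    // spreading the summation across the task pool.
    immutable n = 1_000_000;
    immutable delta = 1.0 / n;

    real getTerm(int i)
    {
        immutable x = (i - 0.5) * delta;
        return delta / (1.0 + x * x);
    }

    immutable pi = 4.0 * taskPool.reduce!"a + b"(
        std.algorithm.map!getTerm(iota(n))
    );
}
```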


Good example, will do.


* "After creation, Task objects are submitted to a TaskPool for
execution." I understand it's possible to use Task straight as a
promise/future, so s/are/may be/.


No. The only way Task is useful is by submitting it to a pool to be
executed. (Though this may change, see below.)


I very much hope this does change. Otherwise the role of Task in the
design could be drastically reduced (e.g. nested type inside of
TaskPool) without prejudice. At the minimum I want to be able to create
a task, launch it, and check its result later without involving a pool.
A pool is when I have many tasks that may exceed the number of CPUs etc.
Simplicity would be great.

// start three reads
auto readFoo = task!readText("foo.txt");
auto readBar = task!readText("bar.txt");
auto readBaz = task!readText("baz.txt");
// join'em all
auto foo = readFoo.yieldWait();
auto bar = readBar.yieldWait();
auto baz = readBaz.yieldWait();


This is definitely feasible in principle.  I'd like to implement it, but 
there's a few annoying, hairy details standing in the way.  For reasons 
I detailed previously, we need both scoped and non-scoped tasks.  We 
also have alias vs. callable (i.e. function pointer or delegate) tasks.
Now we're adding pool vs. new-thread tasks.  This is turning into a
combinatorial explosion and needs to be simplified somehow.  I propose 
the following:


1.  I've reconsidered and actually like the idea of task() vs. 
scopedTask().  task() returns a pointer on the heap.  scopedTask() 
returns a struct on the stack.  Neither would be a member function of 
TaskPool.


2.  Non-scoped Task pointers would need to be explicitly submitted to 
the task pool via the put() method.  This means getting rid of 
TaskPool.task().


3.  The Task struct would grow a function runInNewThread() or something 
similar.  (If you think this would be a common case, even just execute() 
might cut it.)


The work flow would now be that you call task() to get a heap-allocated 
Task*, or scopedTask to get a stack-allocated Task.  You then call 
either TaskPool.put() to execute it on a pool or Task.runInNewThread() 
to run it in a new thread.  The creation of the Task is completely 
orthogonal to how it's run.
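Under this proposal, the workflow might look like the following sketch; `runInNewThread` and the free-function `task` are the names proposed in this post, not a settled API, and `yieldWait` is the wait-function name used earlier in this thread:

```d
import std.parallelism;

int readAnswer() { return 42; }

void main()
{
    // Heap-allocated Task*, submitted explicitly to the shared pool:
    auto poolTask = task!readAnswer();
    taskPool.put(poolTask);

    // Same creation, but run in a dedicated new thread (proposed API):
    auto threadTask = task!readAnswer();
    threadTask.runInNewThread();

    // Creation is orthogonal to where the task runs.
    auto a = poolTask.yieldWait();
    auto b = threadTask.yieldWait();
}
```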




There's no need at this level for a task pool. What would be nice would
be to have a join() that joins all tasks spawned by the current thread:

// start three reads
auto readFoo = task!readText("foo.txt");
auto readBar = task!readText("bar.txt");
auto readBaz = task!readText("baz.txt");
// join'em all
join();
// fetch results
auto foo = readFoo.spinWait();
auto bar = readBar.spinWait();
auto baz = readBaz.spinWait();



I don't understand how this would be a substantial improvement over the 
first example, where you just call yieldWait() on all three. 
Furthermore, implementing join() as shown in this example would require 
some kind of central registry of all tasks/worker threads/task 
pools/something similar, which would be a huge PITA to implement 
efficiently.





Secondly, I think you're reading **WAY** too much into what was meant to
be a simple example to illustrate usage mechanics. This is another case
where I can't think of a small, cute example of where you'd really need
the pool. There are plenty of larger examples, but the smallest/most
self-contained one I can think of is a parallel sort. I decided to use
file reading because it was good enough to illustrate the mechanics of
usage, even if it didn't illustrate a particularly good use case.


It's impossible to not have a good small example. Sorting is great. You
have the partition primitive already in std.algorithm, then off you go
with tasks. Dot product on dense vectors is another good one. There's
just plenty of operations that people understand are important to make
fast.
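As a sketch of the dot-product idea, using `std.numeric.dotProduct` plus one task for the upper half (the split point and the `yieldWait` name from this thread are illustrative assumptions):

```d
import std.numeric : dotProduct;
import std.parallelism;

double parallelDot(double[] a, double[] b)
{
    // Compute the upper half as a task on the pool while this
    // thread handles the lower half, then combine the partial sums.
    immutable mid = a.length / 2;
    auto upper = task!dotProduct(a[mid .. $], b[mid .. $]);
    taskPool.put(upper);
    immutable lower = dotProduct(a[0 .. mid], b[0 .. mid]);
    return lower + upper.yieldWait();
}
```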


I forgot about std.algorithm.partition.

Re: Is there a working web reader for this newsgroup?

2011-03-19 Thread soarowl zhuo

Hello Kagamin,


webnews doesn't show pages other than the first one, and pnews can't
post. Is there a working webreader?



Free RSS Reader and Newsgroup Aggregator
http://www.jetbrains.com/omea/reader/index.html




Re: Conversion to string + string building benchmark

2011-03-19 Thread bearophile
Robert Jacques:

Happy to see my post was not fully lost in the noise :-)

> And I would  
> hazard that Java's StringBuilder isn't giving you O(1) access to the  
> underlying array like Appender is, which would allow it to drastically  
> reduce memory churn.

The first purpose of an Appender/builder is to build an array as fast as
possible. I don't need fast access to the array while I build it (a Deque
allows O(1) access too, just a bit slower).


> In the future, you should also include program RAM usage in these kinds of
> benchmarks.

The amount of memory used/committed is less easy to measure precisely. Here
are approximate values; the results are weird:

Memory usage, best of 3, n = 10_000_000, MB (committed):
  D test1:  290
  D test2a: 186
  D test2b:   1.9
  D test3:  188
  D test4:  106
  Java -Xmx500M -server Test1:  355
  Java -Xmx500M -server Test2a: 355
  Java -Xmx500M -server Test2b: 355

Bye,
bearophile


Re: review of std.parallelism

2011-03-19 Thread dsimcha

On 3/19/2011 8:48 PM, Jonathan M Davis wrote:

On Saturday 19 March 2011 17:31:18 dsimcha wrote:

On 3/19/2011 4:35 PM, Andrei Alexandrescu wrote:

Furthermore, you should expect that the review process will prompt
changes. My perception is that you consider the submission more or less
final modulo possibly a few minor nits. You shouldn't. I'm convinced you
know much more about SMP than most or all others in this group, but in
no way does that mean your design has reached perfection and is beyond
improvement even from a non-expert.


In addition to the deadline issues already mentioned and resolved, I
did misunderstand the review process somewhat.  I didn't participate in
the reviews for std.datetime (because I know nothing about what makes a
good date/time lib) or for std.unittest (because I was fairly ambivalent
about it), so I didn't learn anything from them.  I was under the
impression that the module is **expected** to be very close to its final
form and that, if a lot of issues are found, then that basically means
the proposal is going to be rejected.


Both std.datetime and std.unittests underwent a fair number of changes over the
course of the review process. A lot of the stuff stayed the same, but a lot of it
changed too. On the whole, though, the end results were much better for it.

- Jonathan M Davis


Please check your newsreader settings.  You've been double-posting a lot 
lately.


Re: Conversion to string + string building benchmark

2011-03-19 Thread Robert Jacques
On Thu, 17 Mar 2011 11:06:34 -0400, bearophile   
wrote:



About the versions:

This benchmark (test1/Test1) comes from a reduction of some code of  
mine, it converts integer numbers to string and builds a single very  
long string with them. I have seen the Java code significantly faster  
than the D one.


The successive benchmarks are experiments to better understand where the  
low performance comes from:


test2a/Test2a just builds the string, without integer conversions.

test2b/Test2b just converts ints to strings.

test3 tries a faster int-to-string conversion using the C
library.


test4 is my fastest D solution; it shows that there are ways to write a D
program faster than the Java code, but they aren't handy.


Hi bearophile,
Since I've been working on this problem recently, here's an analysis of  
what's happening: Both Appender and your test cases work by growing an  
array in a 'smart' manner. The issue with this approach is that once the  
arrays get big the conservative GC starts pinning them due to false  
pointers. This slows down the GC a lot (due to some naivety in the GC,  
which is being patched) and creates excessive memory usage. And I would  
hazard that Java's StringBuilder isn't giving you O(1) access to the  
underlying array like Appender is, which would allow it to drastically  
reduce memory churn.


In the future, you should also include program RAM usage in these kinds of
benchmarks.
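One mitigation on the D side is pre-sizing the Appender so it regrows (and leaves false-pointer-prone garbage behind) far less often. A minimal sketch, with the reserve size chosen arbitrarily for illustration:

```d
import std.array : appender;
import std.conv : to;

void main()
{
    auto app = appender!string();
    app.reserve(8 * 1024 * 1024);   // one up-front allocation instead of many regrows
    foreach (i; 0 .. 1_000_000)
        app.put(to!string(i));
    string result = app.data;       // O(1) access to the built array
}
```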


Re: review of std.parallelism

2011-03-19 Thread Jonathan M Davis
On Saturday 19 March 2011 17:31:18 dsimcha wrote:
> On 3/19/2011 4:35 PM, Andrei Alexandrescu wrote:
> > Furthermore, you should expect that the review process will prompt
> > changes. My perception is that you consider the submission more or less
> > final modulo possibly a few minor nits. You shouldn't. I'm convinced you
> > know much more about SMP than most or all others in this group, but in
> > no way does that mean your design has reached perfection and is beyond
> > improvement even from a non-expert.
> 
> In addition to the deadline issues already mentioned and resolved, I
> did misunderstand the review process somewhat.  I didn't participate in
> the reviews for std.datetime (because I know nothing about what makes a
> good date/time lib) or for std.unittest (because I was fairly ambivalent
> about it), so I didn't learn anything from them.  I was under the
> impression that the module is **expected** to be very close to its final
> form and that, if a lot of issues are found, then that basically means
> the proposal is going to be rejected.

Both std.datetime and std.unittests underwent a fair number of changes over the 
course of the review process. A lot of the stuff stayed the same, but a lot of it
changed too. On the whole, though, the end results were much better for it.

- Jonathan M Davis


Re: review of std.parallelism

2011-03-19 Thread dsimcha

On 3/19/2011 4:35 PM, Andrei Alexandrescu wrote:

Furthermore, you should expect that the review process will prompt
changes. My perception is that you consider the submission more or less
final modulo possibly a few minor nits. You shouldn't. I'm convinced you
know much more about SMP than most or all others in this group, but in
no way does that mean your design has reached perfection and is beyond
improvement even from a non-expert.


In addition to the deadline issues already mentioned and resolved, I
did misunderstand the review process somewhat.  I didn't participate in 
the reviews for std.datetime (because I know nothing about what makes a 
good date/time lib) or for std.unittest (because I was fairly ambivalent 
about it), so I didn't learn anything from them.  I was under the 
impression that the module is **expected** to be very close to its final 
form and that, if a lot of issues are found, then that basically means 
the proposal is going to be rejected.


Re: Quo vadis, D2? Thoughts on the D library ecosystem.

2011-03-19 Thread Andrei Alexandrescu

On 03/19/2011 06:12 PM, Jonathan M Davis wrote:

Really, the problem is that someone needs to take the initiative on this. They
need to work on setting it up and supporting the ecosystem which would result in
a group of such projects. Good ideas tend to be presented around here and then
go nowhere, because no one actually takes the initiative to do them. The
"wouldn't this be a good idea?" tactic doesn't tend to get very far, even if
everyone agrees, simply because someone has to put in the time and effort to do
it, and while people may think that it's a good idea, there are only so many
people working on Phobos and other D-related stuff, and there's a lot to be 
done,
and everyone has something that they'd like to see done, and _that_ is what
they're generally working on.


Words of the wise.

It's like listening to a woman's laundry list of the stuff she'd like in a
man and comparing (and contrasting) that with the kind of man she does
respond to. The "desirable stuff" list is built using a rational
response, whereas in reality it's the emotional response that guides
decisions. We hackers seem to use a similar process when deciding what
to work on :o).


That's in particular why we all know e.g. networking is a necessity, but 
instead we work on various other stuff. The person who has an emotional 
positive response for networking code hasn't shown up yet - or just has
in the person of Jonas.



Andrei


Re: review of std.parallelism

2011-03-19 Thread dsimcha

On 3/19/2011 6:58 PM, Andrei Alexandrescu wrote:

On 03/19/2011 04:25 PM, dsimcha wrote:

On 3/19/2011 4:35 PM, Andrei Alexandrescu wrote:

I know you'd have no problem finding the right voice in this discussion
if you only frame it in the right light. Again, people are trying to
help (however awkwardly) and in no way is that ridiculous.


Fair enough. Now that I think of it most of my frustration is that these
details are only getting looked at now, when I have a week (and an
otherwise very busy week) to fix all this stuff, when this module has
been officially in review for the past two weeks and unofficially for
several months. I would be much more open to this process if the issues
raised could be fixed at my leisure rather than on a hard and tight
deadline. This is exacerbated by the fact that I have another important,
unrelated deadline, also next Friday.

At the same time, though, I'm afraid that if we didn't fix a vote date
and put some urgency into things, the pace of the reviews would be
glacial at best, like it was for the first two weeks of official review
and the months of unofficial review.


Exactly. I understand. Well here's a suggestion. How about we "git
stash" this review? I recall another submission was vying for the queue
so we can proceed with that while you have time to operate the changes
you want. If not, we can simply wait 2-3 weeks and then have you
resubmit for a shorter process (1-2 weeks) that would gather more
feedback (hopefully minor that time) and votes.


Sounds like a plan.


Re: review of std.parallelism

2011-03-19 Thread dsimcha

On 3/19/2011 7:33 PM, Daniel Gibson wrote:

Am 19.03.2011 22:45, schrieb dsimcha:

IMHO all early termination that affects subsequent loop iterations as well
(break, goto, labeled break and continue, but not regular continue) should just
throw because they make absolutely no sense in a parallel context.


What about some kind of parallel search? You just want to know if something is
there and on first sight you're done, so you want to break out of this iteration
and don't want any new parallel iterations to be done.
(Not sure what should be done with currently running iterations.. should they be
killed or go on until they're done - anyway, the executing thread shouldn't
start a new iteration after that)


It's an interesting suggestion, and I'll give some thought to it, but on 
first glance it sounds **very** difficult to implement efficiently.  To 
avoid doing a lot of extra work after you've found what you need, you'd 
need to use very small work units.  To avoid excessive overhead you'd 
need to use fairly large work units.  The only case I can think of where
this would be doable is when evaluating the predicate is very expensive.
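One way to approximate early termination today, without giving `break` parallel semantics, is a shared flag checked with plain `continue`; a sketch (iterations already in flight still finish their work unit, as discussed above):

```d
import core.atomic;
import std.parallelism;

bool parallelContains(int[] haystack, int needle)
{
    shared bool found = false;
    foreach (x; parallel(haystack))
    {
        if (atomicLoad(found))
            continue;               // cheaply skip the rest of this iteration
        if (x == needle)
            atomicStore(found, true);
    }
    return atomicLoad(found);
}
```

This avoids most of the wasted work when the predicate is cheap to check, at the cost of one atomic load per iteration.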


Re: review of std.parallelism

2011-03-19 Thread Daniel Gibson
Am 19.03.2011 22:45, schrieb dsimcha:
> IMHO all early termination that affects subsequent loop iterations as well
> (break, goto, labeled break and continue, but not regular continue) should 
> just
> throw because they make absolutely no sense in a parallel context.

What about some kind of parallel search? You just want to know if something is
there and on first sight you're done, so you want to break out of this iteration
and don't want any new parallel iterations to be done.
(Not sure what should be done with currently running iterations.. should they be
killed or go on until they're done - anyway, the executing thread shouldn't
start a new iteration after that)


Re: Trivial DMD fixes: GitHub pull requests vs. Bugzilla issues

2011-03-19 Thread Jonathan M Davis
On Saturday 19 March 2011 08:43:44 David Nadlinger wrote:
> For almost a month now, I have a trivial pull request open for DMD:
> https://github.com/D-Programming-Language/dmd/pull/10. It's only about
> adding the word »length« in two places to clarify the tuple
> out-of-bounds error message, so I didn't bother to open a ticket for it
> because I figured that it would only create unneeded administrative
> overhead for such a small change.
> 
> However, given that the commit has not been merged yet: Walter, do you
> still prefer Bugzilla issues for this kind of patch?

Bugzilla is for tracking bugs. github is for tracking the source. So, in 
general, I would expect bugs to be reported to bugzilla.  I would guess though, 
that enhancement requests aren't quite as critical if you already have a patch 
for them (though you're likely to get a better discussion on bugzilla if you 
post enhancements there in addition to creating a pull request). However, given 
that this sounds like a very small change, it's probably fine that it's just a 
pull request - though, of course, Walter would be better suited to say what 
Walter would prefer.

Regardless, as I understand it, Walter has been a bit overwhelmed with pull 
requests of late (you can check out the thread on dmd-internals about it: 
http://lists.puremagic.com/pipermail/dmd-internals/2011-March/001293.html ), 
and 
it's taking him some time to work through them. I expect that he'll get to your 
pull request eventually.

- Jonathan M Davis


Re: Quo vadis, D2? Thoughts on the D library ecosystem.

2011-03-19 Thread Jonathan M Davis
On Saturday 19 March 2011 07:43:37 David Nadlinger wrote:
> While lying in bed with a fever yesterday (so please excuse any
> careless mistakes), I was pondering a bit about the current discussions
> regarding Phobos additions, package management, etc. It occurred to me
> that there is a central unanswered question, which I think deserves to
> be broadly discussed right now.
> 
> But first, let me start out by describing how I see the current situation
> regarding D2. Leaving aside a few minor things like @property
> enforcement or the recent suggestions about a new alias syntax, the
> language is fairly stable and critical bugs in DMD 2 are not frequent
> enough to make it completely unusable for day-to-day development
> anymore. Of course, there is still a large way to go for the D toolchain
> (with the ideal result being a rock-solid self-hosting compiler
> front-end, usable as a library as well), but in a sense, we are more or
> less at the end of a certain stage of D2 development.
> 
> I think most of you would agree with me if I say that the main goal for
> D2 right now should be to build a vibrant library ecosystem around the
> language, to foster adoption in real-world applications. There has been
> a number of related discussions recently, but as mentioned above, I
> think there is a central question:
> 
> Have we reached the critical mass yet where it makes sense to split the
> effort in a number of smaller library projects, or are we better off
> concentrating on a central, comprehensive standard library
> (Phobos), considering the current community size?
> 
> I do not really have an answer to this question, but here are a few
> thoughts on the topic, which might also help to make clearer what I mean:
> 
> I think that adopting a Boost-like review process for Phobos has
> certainly been a clever and valuable move, for more than one reason.
> First, together with the move to Git, it has helped to reinforce the
> point that D2 and Phobos are open to contributions from everyone, given
> that they meet certain quality standards. Second, it certainly boosts
> code quality of further standard library additions, which had been a
> problem for some parts in the past (at least from my point of view, no
> offense intended). Third, and this overlaps with another point below, I
> think that the quality improvements will also help to reduce bit rot,
> which has traditionally been a problem with D libraries.
> 
> But however good a fit this model is for the standard library, I think
> it is no silver bullet either. There are small, one-off style projects,
> arising from a central need, where the amount of time needed to get the
> code through the whole review process is prohibitive – even if the code
> quality was high enough –, but the result is still usable for the wide
> public. Common examples for this would be low-level wrappers for C
> libraries, although they don't really qualify for inclusion into Phobos
> for other reasons (often, another wrapper layer is needed to be usable
> with common D idioms). Also, people new to the language might be scared
> away by the mere thought of contributing to a standard library. How to
> make sure that these libraries are not forgotten? Maybe a central
> package system with SCM (Git, …) integration can help here?
> 
> And, which brings me to the next point, how to fight the unfavorable
> outcome of having a huge inscrutable pile of half-finished bit-rotten
> code, a problem that DSource is currently experiencing? A central,
> well-maintained standard library effort with a wider scope could
> certainly help to reduce this problem, at least from the (D) user side,
> but on the other hand, larger amounts of code de facto becoming
> unmaintained would be a problem for it as well.
> 
> Should we build something like a staging area, an incubator for
> community contributions not taken yet through formal review, but of
> interest for a wider audience? What about the etc.* package – would it
> be an option to expand it into such an incubation area? If not, what
> should it evolve into – a collection of C-level library bindings (see
> the recent discussion on SQLite bindings started by David Simcha)? Who
> will take care of the maintenance duties?
> 
> Looking forward to a stimulating discussion,
> David

There has been some discussion in the past of creating an incubator project of 
sorts where code which may or may not make it into Phobos can be put so that it 
can develop and evolve with people actually using it - maybe even using it 
heavily - before it tried to get into Phobos. Once a library was considered 
mature enough, it could go through the Phobos review process and attempt to get 
into the standard library. If it succeeded, then, in theory, we'd have a well-
used and well-tested library added to Phobos. If it failed, it would still be 
around for people to use, and it could continue to be used and evolve - either 
to make a later attempt at inclusion in Phobos 

Re: review of std.parallelism

2011-03-19 Thread Andrei Alexandrescu

On 03/19/2011 04:25 PM, dsimcha wrote:

On 3/19/2011 4:35 PM, Andrei Alexandrescu wrote:

I know you'd have no problem finding the right voice in this discussion
if you only frame it in the right light. Again, people are trying to
help (however awkwardly) and in no way is that ridiculous.


Fair enough. Now that I think of it most of my frustration is that these
details are only getting looked at now, when I have a week (and an
otherwise very busy week) to fix all this stuff, when this module has
been officially in review for the past two weeks and unofficially for
several months. I would be much more open to this process if the issues
raised could be fixed at my leisure rather than on a hard and tight
deadline. This is exacerbated by the fact that I have another important,
unrelated deadline, also next Friday.

At the same time, though, I'm afraid that if we didn't fix a vote date
and put some urgency into things, the pace of the reviews would be
glacial at best, like it was for the first two weeks of official review
and the months of unofficial review.


Exactly. I understand. Well here's a suggestion. How about we "git 
stash" this review? I recall another submission was vying for the queue 
so we can proceed with that while you have time to operate the changes 
you want. If not, we can simply wait 2-3 weeks and then have you 
resubmit for a shorter process (1-2 weeks) that would gather more 
feedback (hopefully minor that time) and votes.


Lars?


Also increasing the deadline pressure issue, Michael Fortin just caused
me to rethink the issue of exception handling in parallel foreach. I had
more-or-less working code for this, but I realized it's severely broken
in subtle ways that I've (knock on wood) never actually run into in real
world code. It's gonna take some time to fix. These kinds of issues with
error handling code can very easily slip under the radar in a library
with only a few users, and unfortunately, one has. I definitely know how
to fix it, but I have to implement the fix and it's somewhat
non-trivial, i.e. I'm debugging it now and it's looking like a marathon
debugging session.


Fixing these sooner rather than later would be great, so  all the better 
for suspending the review.


Andrei


Re: review of std.parallelism

2011-03-19 Thread Jonathan M Davis
On Saturday 19 March 2011 09:03:51 Andrei Alexandrescu wrote:
> On 03/19/2011 02:32 AM, dsimcha wrote:
> > Ok, thanks again for clarifying **how** the docs could be improved. I've
> > implemented the suggestions and generally given the docs a good read-over
> > and cleanup. The new docs are at:
> > 
> > http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html
> 
> * When using "parallelism" as a common noun, prefix it with a '_' so
> ddoc doesn't underline it.

Oooh. That's a neat trick. I didn't know that you could do that.

- Jonathan M Davis
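For the record, the trick works like this: in Ddoc comments a leading underscore suppresses the automatic emphasis/linking of a word that matches a symbol or parameter name, and the underscore itself is not rendered. A small sketch:

```d
/**
This module implements _parallelism primitives.

Params:
    n = number of work units; writing _n in prose shows the
        word without Ddoc italicizing it as the parameter name.
*/
void example(int n) {}
```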


Re: Is there a working web reader for this newsgroup?

2011-03-19 Thread Gour
On Sat, 19 Mar 2011 16:11:48 -0400
Kagamin  wrote:

> webnews doesn't show pages other than the first one, and pnews can't
> post. Is there a working webreader?

Gmane?


-- 
“In the material world, conceptions of good and bad are
all mental speculations…” (Sri Caitanya Mahaprabhu)

http://atmarama.net | Hlapicina (Croatia) | GPG: CDBF17CA






Re: Dream package management system (Was: a cabal for D ?)

2011-03-19 Thread Andrej Mitrovic
Even AutoHotkey has a system of finding and installing libraries. It
can run example code without installing (well, it is a scripting
language), show the source, documentation and some other things.

Snapshot: http://i.imgur.com/Uv1Gr.jpg


Re: review of std.parallelism

2011-03-19 Thread dsimcha

On 3/19/2011 5:33 PM, Michel Fortin wrote:

On 2011-03-19 15:36:24 -0400, dsimcha  said:


The only problem is that there's no easy, well-documented way to tell
from the return value of opApply whether it was a break, a goto, a
labeled break/continue, etc. This would be implementable only if I
changed the semantics of break to also throw. This might not be a bad
thing (IMHO any type of breaking out of a parallel foreach loop is
just silly) but others had asked for different semantics for break.


It's not that silly.

Essentially, what you'd express like this with a normal function taking
a delegate:

taskPool.apply([1,2,3], (int i) {
if (i == 1)
return;
// do some things
});

you'd express like this in a parallel foreach:

foreach (int i; parallel([1,2,3])) {
if (i == 1)
break;
// do some things
}

It's not following the semantics of break within a foreach, but it's
still useful to be able to return early from a function (from a loop
iteration in this case), so I see the use case for making 'break' do
what it does.



Using continue is well-defined behavior and does exactly what I think 
you're suggesting.  The problem with break is that it's supposed to stop 
all subsequent iterations of the loop from being executed, not just end 
the current one early.  This only makes sense in a serial context, where 
"subsequent iterations" is well-defined.  Currently break breaks from 
the current work unit but continues executing all other work units.  I 
put this behavior in because I couldn't think of anything else to make 
it do, and someone (I don't remember who) asked for it.  I have never 
encountered a use case for it.


IMHO all early termination that affects subsequent loop iterations as 
well (break, goto, labeled break and continue, but not regular continue) 
should just throw because they make absolutely no sense in a parallel 
context.
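The distinction in code, with the semantics as described above (`parallel` from the module under review):

```d
import std.parallelism;

void main()
{
    foreach (i; parallel([1, 2, 3, 4, 5]))
    {
        if (i == 3)
            continue;  // well-defined: ends this iteration only

        // A `break` here would currently stop only this work unit's
        // remaining iterations; other work units keep running.
    }
}
```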


Re: On alias a = b

2011-03-19 Thread Andrej Mitrovic
Maybe

template addSize(...)
{
   enum this = ...;
}


Re: On alias a = b

2011-03-19 Thread Nick Sabalausky
"KennyTM~"  wrote in message 
news:im37b7$i45$1...@digitalmars.com...
> On Mar 20, 11 05:14, Nick Sabalausky wrote:
>> "Andrei Alexandrescu"  wrote in message
>> news:im0g0n$1mal$1...@digitalmars.com...
>>> On 3/18/11 3:28 PM, so wrote:
 alias a(T) = b(T, known_type);

 Would it be an overkill?
>>>
>>> It's part of the evil plan.
>>>
>>
>> What about stuff like this?:
>>
>>  template foo() {}
>>  template foo(string str) {}
>>  template foo(int i) {}
>>
>>  alias foo bar;
>>
>>  // use bar!(), bar!("str") and bar!(7)
>>
>> Currently that doesn't work (though I forget exactly how it fails). I've
>> been starting to find that to be more and more of a problem. I don't
>> remember if it works or not when foo is a series of non-templated function
>> overloads. If not, it should, IMO.
>>
>
> Huh? It *does* work. See http://ideone.com/moPhm.
>

Hmm, weird. That works for me on 2.052, too, and also works with function 
overloads instead of templates. I could have sworn there was something like 
that that didn't work. Maybe it was just some corner-case bug. Or maybe I'm 
just nuts and never did come across such a problem...
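
For reference, the pattern under discussion — which, per KennyTM's ideone
link, does compile on 2.052 — is a sketch along these lines:

```d
// Three template overloads sharing one name...
template foo()           { enum foo = 0; }
template foo(string str) { enum foo = 1; }
template foo(int i)      { enum foo = 2; }

// ...aliased as a unit: bar picks up the whole overload set.
alias foo bar;

static assert(bar!()      == 0);
static assert(bar!("str") == 1);
static assert(bar!(7)     == 2);
```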




Re: review of std.parallelism

2011-03-19 Thread Michel Fortin

On 2011-03-19 15:36:24 -0400, dsimcha  said:


On 3/19/2011 2:35 PM, Michel Fortin wrote:

On 2011-03-19 13:16:02 -0400, dsimcha  said:


* "A goto from inside the parallel foreach loop to a label outside the
loop will result in undefined behavior." Would this be a bug in dmd?


No, it's because a goto of this form has no reasonable, useful
semantics. I should probably mention in the docs that the same applies
to labeled break and continue.

I have no idea what semantics these should have, and even if I did,
given the long odds that even one person would actually need them, I
think they'd be more trouble than they're worth to implement. For
example, once you break out of a parallel foreach loop to some
arbitrary address (and different threads can goto different labels,
etc.), well, it's no longer a parallel foreach loop. It's just a bunch
of completely unstructured threading doing god-knows-what.

Therefore, I slapped undefined behavior on it as a big sign that says,
"Just don't do it." This also has the advantage that, if anyone ever
thinks of any good, clearly useful semantics, these will be
implementable without breaking code later.


I think an improvement over undefined behaviour would be to throw an
exception.


The only problem is that there's no easy, well-documented way to tell 
from the return value of opApply whether it was a break, a goto, a 
labeled break/continue, etc.  This would be implementable only if I 
changed the semantics of break to also throw.  This might not be a bad 
thing (IMHO any type of breaking out of a parallel foreach loop is just 
silly) but others had asked for different semantics for break.


It's not that silly.

Essentially, what you'd express like this with a normal function taking 
a delegate:


taskPool.apply([1,2,3], (int i) {
    if (i == 1)
        return;
    // do some things
});

you'd express like this in a parallel foreach:

foreach (int i; parallel([1,2,3])) {
    if (i == 1)
        break;
    // do some things
}

It's not following the semantics of break within a foreach, but it's 
still useful to be able to return early from a function (from a loop 
iteration in this case), so I see the use case for making 'break' do 
what it does.


My only gripe is that the semantics are distorted. In fact, just making 
foreach parallel distorts its semantics. I was confused earlier about a 
foreach being parallel when it was not, someone could also be confused 
in the other direction, thinking foreach is the normal foreach when it 
actually is parallel. This makes code harder to review. Don't consider 
only my opinion on this, but in my opinion the first form above 
(taskPool.apply) is preferable because you absolutely can't mistake it 
with a normal foreach. And I think it's especially important to make it 
stand out given that the compiler can't check for low-level races.




Also, what happens if one of the tasks throws an exception?


It gets rethrown when yieldWait()/spinWait()/workWait() is called.  In 
the case of the higher-level primitives, it gets re-thrown to the 
calling thread at some non-deterministic point in the execution of 
these functions.  I didn't see the need to document this explicitly 
because it "just works".


That's indeed what I'd expect. But I think it'd still be worth 
mentioning in a short sentence in yieldWait()/spinWait()/workWait()'s 
documentation. It's comforting when the documentation confirms your 
expectations.
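
A short sketch of the behavior being discussed, using the reviewed API's
names (task, taskPool.put, yieldWait); the file name is of course
hypothetical:

```d
import std.file, std.parallelism;

void main()
{
    // readText will throw inside the worker thread if the file is missing.
    auto t = task!readText("no-such-file.txt");  // hypothetical file name
    taskPool.put(t);

    try
    {
        auto contents = t.yieldWait();  // the worker's exception is rethrown here
    }
    catch (FileException e)
    {
        // handle it on the waiting thread
    }
}
```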


--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



Re: On alias a = b

2011-03-19 Thread Nick Sabalausky
"KennyTM~"  wrote in message 
news:im0gtr$1o3q$1...@digitalmars.com...
>
> enum addSize(int total, T) = total + T.sizeof;
>

The great thing about that would be a decreased need for the terribly non-DRY 
eponymous template syntax:

template addSize(...)
{
enum addSize = ...;
}

Ugh. Is there anything in the works to take care of that? It's like C++'s 
naming system for constructors. I'd even be happier with something like:

template addSize(...)
{
alias _ this;
enum _ = ...;
}

Or maybe if it's grammatically viable:

template addSize(...)
{
enum template = ...;
}
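
For concreteness, the current eponymous idiom for the addSize example from
KennyTM's post looks like this (a sketch, not library code):

```d
template addSize(int total, T)
{
    // the member named after the template is what addSize!(...) evaluates to
    enum addSize = total + cast(int) T.sizeof;
}

static assert(addSize!(0, int)    == 4);
static assert(addSize!(4, double) == 12);
```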




Re: review of std.parallelism

2011-03-19 Thread dsimcha

On 3/19/2011 4:35 PM, Andrei Alexandrescu wrote:

On 03/19/2011 12:16 PM, dsimcha wrote:

On 3/19/2011 12:03 PM, Andrei Alexandrescu wrote:

On 03/19/2011 02:32 AM, dsimcha wrote:

Ok, thanks again for clarifying **how** the docs could be improved.
I've
implemented the suggestions and generally given the docs a good reading
over and clean up. The new docs are at:

http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html


* Still no synopsis example that illustrates in a catchy way the most
attractive artifacts.


I don't see what I could put here that isn't totally redundant with the
rest of the documentation. Anything I could think of would basically
just involve concatenating all the examples. Furthermore, none of the
other Phobos modules have this, so I don't know what one should look
like.


I'm thinking along the lines of:

http://www.digitalmars.com/d/2.0/phobos/std_exception.html

A nice synopsis would be the pi computation. Just move that up to the
synopsis. It's simple, clean, and easy to relate to. Generally, you'd
put here not all details but the stuff you think would make it easiest
for people to get into your library.


In general I feel like std.parallelism is being held to a
ridiculously high standard that none of the other Phobos modules
currently meet.


I agree, but that goes without saying. This is not a double standard;
it's a simple means to improve quality of Phobos overall. Clearly
certain modules that are already in Phobos would not make it if proposed
today. And that's a good thing! Comparing our current ambitions to the
quality of the average Phobos module would not help us.

Generally it seems you have grown already tired of the exchange and it
would take only a few more exchanges for you to say, "you know what,
either let's get it over with or forget about it."

Please consider for a minute how this is the wrong tone and attitude to
be having on several levels. This is not a debate and not the place to
get defensive. Your role in the discussion is not symmetric with the
others' and at best you'd use the review as an opportunity to improve
your design, not stick heels in the ground to defend its current
incarnation (within reason). Your potential customers are attempting to
help you by asking questions (some of which no doubt are silly) and by
making suggestions (some of which, again, are ill-founded).
Nevertheless, we _are_ your potential customers and in a way the
customer is always right. Your art is to steer customers from what they
think they want to what you know they need - because you're the expert!
- and to improve design, nomenclature, and implementation to the extent
that would help them.

Furthermore, you should expect that the review process will prompt
changes. My perception is that you consider the submission more or less
final modulo possibly a few minor nits. You shouldn't. I'm convinced you
know much more about SMP than most or all others in this group, but in
no way that means your design has reached perfection and is beyond
improvement even from a non-expert.

I know you'd have no problem finding the right voice in this discussion
if you only frame it in the right light. Again, people are trying to
help (however awkwardly) and in no way is that ridiculous.


Fair enough.  Now that I think of it most of my frustration is that 
these details are only getting looked at now, when I have a week (and an 
otherwise very busy week) to fix all this stuff, when this module has 
been officially in review for the past two weeks and unofficially for 
several months.  I would be much more open to this process if the issues 
raised could be fixed at my leisure rather than on a hard and tight 
deadline.  This is exacerbated by the fact that I have another 
important, unrelated deadline, also next Friday.


At the same time, though, I'm afraid that if we didn't fix a vote date 
and put some urgency into things, the pace of the reviews would be 
glacial at best, like it was for the first two weeks of official review 
and the months of unofficial review.


How about we make next Friday a soft deadline/start of voting?  It can 
be extended as necessary by mutual agreement of me and Lars (the review 
manager), and likely will be if the review process is still yielding 
good suggestions and/or I haven't had time to implement/clean up some 
key things.  Having a deadline breathing down your neck like this is 
really not conducive to being open to suggestions and thoughtful 
consideration, especially for issues that seem like fairly minor details.


Also adding to the deadline pressure, Michel Fortin just caused 
me to rethink the issue of exception handling in parallel foreach.  I 
had more-or-less working code for this, but I realized it's severely 
broken in subtle ways that I've (knock on wood) never actually run into 
in real world code.  It's gonna take some time to fix.  These kinds of 
issues with error handling code can very easily slip under the radar in 
a librar

Re: On alias a = b

2011-03-19 Thread KennyTM~

On Mar 20, 11 05:14, Nick Sabalausky wrote:

"Andrei Alexandrescu"  wrote in message
news:im0g0n$1mal$1...@digitalmars.com...

On 3/18/11 3:28 PM, so wrote:

alias a(T) = b(T, known_type);

Would it be an overkill?


It's part of the evil plan.



What about stuff like this?:

 template foo() {}
 template foo(string str) {}
 template foo(int i) {}

 alias foo bar;

 // use bar!(), bar!("str") and bar!(7)

Currently that doesn't work (though I forget exactly how it fails). I've
been starting to find that to be more and more of a problem. I don't
remember if it works or not when foo is a series of non-templated function
overloads. If not, it should, IMO.



Huh? It *does* work. See http://ideone.com/moPhm.


Or maybe that would be solved (at least for templates) under the proposed
evil plan with something like this?:

 alias a(T...) = b!(T);







Re: Is there a working web reader for this newsgroup?

2011-03-19 Thread Jesse Phillips
Kagamin Wrote:

> webnews doesn't show pages other than the first one, and pnews can't post.
> Is there a working webreader?

web-news shows all the pages. It is buggy at times and things will not always 
thread correctly, but it does show everything.

http://www.digitalmars.com/webnews/newsgroups.php?search_txt=&group=digitalmars.D

There is also http://news.gmane.org/gmane.comp.lang.d.general but posting 
doesn't work either.


Re: On alias a = b

2011-03-19 Thread Nick Sabalausky
"Andrei Alexandrescu"  wrote in message 
news:im0g0n$1mal$1...@digitalmars.com...
> On 3/18/11 3:28 PM, so wrote:
>> alias a(T) = b(T, known_type);
>>
>> Would it be an overkill?
>
> It's part of the evil plan.
>

What about stuff like this?:

template foo() {}
template foo(string str) {}
template foo(int i) {}

alias foo bar;

// use bar!(), bar!("str") and bar!(7)

Currently that doesn't work (though I forget exactly how it fails). I've 
been starting to find that to be more and more of a problem. I don't 
remember if it works or not when foo is a series of non-templated function 
overloads. If not, it should, IMO.

Or maybe that would be solved (at least for templates) under the proposed 
evil plan with something like this?:

alias a(T...) = b!(T);





Re: On alias a = b

2011-03-19 Thread Nick Sabalausky
"KennyTM~"  wrote in message 
news:im0g86$1mr4$1...@digitalmars.com...
> On Mar 19, 11 04:28, so wrote:
>> alias a(T) = b(T, known_type);
>>
>> Would it be an overkill?
>
> If B is a template I think it's more consistent to add a '!':
>
>alias A(T) = B!(T, int);
>
> But I don't think it worth such generalization given the existing syntax 
> already works:
>
>template A(T) {
>  alias B!(T, int) A;
>  // alias A = B!(T, known_type);
>}

Templated functions already use the same sugar.
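
Spelled out, KennyTM's existing-syntax form is a partial application of B's
second parameter; a minimal self-contained sketch:

```d
struct B(T, U) { T first; U second; }

// A!(T) is an eponymous alias for B!(T, int)
template A(T)
{
    alias B!(T, int) A;
}

static assert(is(A!(double) == B!(double, int)));
```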




Re: On alias a = b

2011-03-19 Thread Nick Sabalausky
"KennyTM~"  wrote in message 
news:im0gtr$1o3q$1...@digitalmars.com...
> On Mar 19, 11 04:41, Steven Schveighoffer wrote:
>> On Fri, 18 Mar 2011 16:37:49 -0400, Andrei Alexandrescu
>>  wrote:
>>
>>> On 3/18/11 3:28 PM, so wrote:
 alias a(T) = b(T, known_type);

 Would it be an overkill?
>>>
>>> It's part of the evil plan.
>>
>> (I think there is a typo above, shouldn't it be alias a(T) = b!(T,
>> known_type) ? )
>>
>> You mean you would no longer need the surrounding template declaration?
>>
>> i.e. the above (corrected) statement would be short for:
>>
>> template a(T)
>> {
>> alias a = b!(T, known_type);
>> }
>>
>> That would be certainly very un-evil ;)
>>
>> -Steve
>
> Being evil would be:
>
> alias staticReduce(alias F, alias Init) = Init;
> alias staticReduce(alias F, alias Init, T...) if (T.length != 0) = 
> staticReduce!(F, F!(Init, T[0]), T[1..$]);
> //^ support conditionals?
> enum addSize(int total, T) = total + T.sizeof;
> // ^ yeah why not generalize to enum too?
>
> static assert(staticReduce!(addSize, 0, byte, short, int, long[2], float, 
> double) == 35);
>
>
> ;)

Tasty. I like it.
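
The hypothetical syntax above can already be expressed (more verbosely)
today; a sketch using the existing eponymous idiom and static if in place of
the proposed conditional:

```d
template staticReduce(alias F, alias Init, T...)
{
    static if (T.length == 0)
        enum staticReduce = Init;
    else
        enum staticReduce = staticReduce!(F, F!(Init, T[0]), T[1 .. $]);
}

template addSize(int total, T)
{
    enum addSize = total + cast(int) T.sizeof;
}

// 1 + 2 + 4 + 16 + 4 + 8 == 35
static assert(staticReduce!(addSize, 0,
        byte, short, int, long[2], float, double) == 35);
```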





Re: Is there a working web reader for this newsgroup?

2011-03-19 Thread Ali Çehreli

On 03/19/2011 01:11 PM, Kagamin wrote:

webnews doesn't show pages other than the first one, and pnews can't post.
Is there a working webreader?


Probably you have already seen this page:

http://www.prowiki.org/wiki4d/wiki.cgi?NewsDmD#LanguageDevelopmentNewsgroup

I don't know of any usable newsreader. Perhaps it means that I am 
officially old now, or nobody cares about usability anymore. :/ Nothing 
seems to be designed for humans.


Ali


Re: review of std.parallelism

2011-03-19 Thread Andrei Alexandrescu

On 03/19/2011 12:16 PM, dsimcha wrote:

On 3/19/2011 12:03 PM, Andrei Alexandrescu wrote:

On 03/19/2011 02:32 AM, dsimcha wrote:

Ok, thanks again for clarifying **how** the docs could be improved. I've
implemented the suggestions and generally given the docs a good reading
over and clean up. The new docs are at:

http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html


* Still no synopsis example that illustrates in a catchy way the most
attractive artifacts.


I don't see what I could put here that isn't totally redundant with the
rest of the documentation. Anything I could think of would basically
just involve concatenating all the examples. Furthermore, none of the
other Phobos modules have this, so I don't know what one should look
like.


I'm thinking along the lines of:

http://www.digitalmars.com/d/2.0/phobos/std_exception.html

A nice synopsis would be the pi computation. Just move that up to the 
synopsis. It's simple, clean, and easy to relate to. Generally, you'd 
put here not all details but the stuff you think would make it easiest 
for people to get into your library.
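
The pi computation referred to is, roughly, a parallel reduce over a mapped
index range (reconstructed from the reviewed docs, so treat the details as a
sketch):

```d
import std.algorithm, std.parallelism, std.range;

void main()
{
    // Numerically integrate 4/(1 + x^2) over [0, 1]; the sum approaches pi.
    enum n = 1_000_000;
    enum delta = 1.0 / n;

    static real getTerm(int i)
    {
        immutable x = (i - 0.5) * delta;
        return delta / (1.0 + x * x);
    }

    immutable pi = 4.0 * taskPool.reduce!"a + b"(
        std.algorithm.map!getTerm(iota(n)));
}
```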



In general I feel like std.parallelism is being held to a
ridiculously high standard that none of the other Phobos modules
currently meet.


I agree, but that goes without saying. This is not a double standard; 
it's a simple means to improve quality of Phobos overall. Clearly 
certain modules that are already in Phobos would not make it if proposed 
today. And that's a good thing! Comparing our current ambitions to the 
quality of the average Phobos module would not help us.


Generally it seems you have grown already tired of the exchange and it 
would take only a few more exchanges for you to say, "you know what, 
either let's get it over with or forget about it."


Please consider for a minute how this is the wrong tone and attitude to 
be having on several levels. This is not a debate and not the place to 
get defensive. Your role in the discussion is not symmetric with the 
others' and at best you'd use the review as an opportunity to improve 
your design, not stick heels in the ground to defend its current 
incarnation (within reason). Your potential customers are attempting to 
help you by asking questions (some of which no doubt are silly) and by 
making suggestions (some of which, again, are ill-founded). 
Nevertheless, we _are_ your potential customers and in a way the 
customer is always right. Your art is to steer customers from what they 
think they want to what you know they need - because you're the expert! 
- and to improve design, nomenclature, and implementation to the extent 
that would help them.


Furthermore, you should expect that the review process will prompt 
changes. My perception is that you consider the submission more or less 
final modulo possibly a few minor nits. You shouldn't. I'm convinced you 
know much more about SMP than most or all others in this group, but in 
no way that means your design has reached perfection and is beyond 
improvement even from a non-expert.


I know you'd have no problem finding the right voice in this discussion 
if you only frame it in the right light. Again, people are trying to 
help (however awkwardly) and in no way is that ridiculous.



* "After creation, Task objects are submitted to a TaskPool for
execution." I understand it's possible to use Task straight as a
promise/future, so s/are/may be/.


No. The only way Task is useful is by submitting it to a pool to be
executed. (Though this may change, see below.)


I very much hope this does change. Otherwise the role of Task in the 
design could be drastically reduced (e.g. nested type inside of 
TaskPool) without prejudice. At the minimum I want to be able to create 
a task, launch it, and check its result later without involving a pool. 
A pool is when I have many tasks that may exceed the number of CPUs etc. 
Simplicity would be great.


// start three reads
auto readFoo = task!readText("foo.txt");
auto readBar = task!readText("bar.txt");
auto readBaz = task!readText("baz.txt");
// join'em all
auto foo = readFoo.yieldWait();
auto bar = readBar.yieldWait();
auto baz = readBaz.yieldWait();

There's no need at this level for a task pool. What would be nice would 
be to have a join() that joins all tasks spawned by the current thread:


// start three reads
auto readFoo = task!readText("foo.txt");
auto readBar = task!readText("bar.txt");
auto readBaz = task!readText("baz.txt");
// join'em all
join();
// fetch results
auto foo = readFoo.spinWait();
auto bar = readBar.spinWait();
auto baz = readBaz.spinWait();

The way I see it is, task pools are an advanced means that coordinate m 
threads over n CPUs. If I don't care about that (as above) there should 
be no need for a pool at all. (Of course it's fine if used by the 
implementation.)



Also it is my understanding that
TaskPool efficiently adapts the concrete number of CPUs to an arbitrary
number of tasks. If that's the case, it's worth mentioni

Is there a working web reader for this newsgroup?

2011-03-19 Thread Kagamin
webnews doesn't show pages other than the first one, and pnews can't post.
Is there a working webreader?


incremental builds for D projects... challenging or close at hand?

2011-03-19 Thread Jason E. Aten
> On Sat, 19 Mar 2011 12:19:58 +0100, Jacob Carlborg wrote:
>> What are people's experiences with the various options for build
>> systems with D?
> 
> It's not very easy to make an incremental build system for D because of
> several reasons. Some are due to how the language works and some are due
> to how DMD works:
> 
> * DMD doesn't output all data in all the object files - This can perhaps
> be solved by compiling with the -lib switch
> 
> * When you change one D file you need to recompile ALL files that depend
> on the changed file. To compare with C/C++ which has source and header
> files you only need to recompile the source file if you change it
> 
> * DMD doesn't keep the fully qualified module name when naming object
> files, resulting in foo.bar conflicting with bar.bar. Issue 3541.

[The above is from the packaging system discussion in the "a cabal for 
D?" thread; here I am branching this to a new topic because I'd like 
anyone interested in incremental build processes to notice and contribute 
if they have input.]

That is an interesting observation, Jacob. Thank you for pointing that 
out. 

Is there anything else (open question anyone) that would prevent D 
projects from doing incremental builds?  Lack of support for incremental 
builds is a show stopper.  Or in this case, the show would never get 
funded to begin with.

Or to ask it another way, what would it take to get incremental builds?


Re: review of std.parallelism

2011-03-19 Thread dsimcha

On 3/19/2011 3:36 PM, dsimcha wrote:

On 3/19/2011 2:35 PM, Michel Fortin wrote:
It gets rethrown when yieldWait()/spinWait()/workWait() is called. In
the case of the higher-level primitives, it gets re-thrown to the
calling thread at some non-deterministic point in the execution of these
functions. I didn't see the need to document this explicitly because it
"just works".



...Though now that you make me think of it I need to support exception 
chaining for the case of multiple concurrently thrown exceptions instead 
of just arbitrarily and non-deterministically rethrowing one.  The code 
that handles this was written before exception chaining existed.  Will 
get on that.
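
A hedged sketch of what such chaining could look like: Throwable.next is the
real chaining hook in druntime, but the helper and its inputs here are
assumptions, not the module's actual code.

```d
// Hypothetical helper: link exceptions caught from several workers into
// one chain, so rethrowing the head still exposes the rest via .next.
Throwable chainAll(Throwable[] caught)
{
    Throwable head = null;
    foreach_reverse (t; caught)
    {
        t.next = head;  // attach the previously chained exceptions behind t
        head = t;
    }
    return head;
}
```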


Re: review of std.parallelism

2011-03-19 Thread dsimcha

On 3/19/2011 2:35 PM, Michel Fortin wrote:

On 2011-03-19 13:16:02 -0400, dsimcha  said:


* "A goto from inside the parallel foreach loop to a label outside the
loop will result in undefined behavior." Would this be a bug in dmd?


No, it's because a goto of this form has no reasonable, useful
semantics. I should probably mention in the docs that the same applies
to labeled break and continue.

I have no idea what semantics these should have, and even if I did,
given the long odds that even one person would actually need them, I
think they'd be more trouble than they're worth to implement. For
example, once you break out of a parallel foreach loop to some
arbitrary address (and different threads can goto different labels,
etc.), well, it's no longer a parallel foreach loop. It's just a bunch
of completely unstructured threading doing god-knows-what.

Therefore, I slapped undefined behavior on it as a big sign that says,
"Just don't do it." This also has the advantage that, if anyone ever
thinks of any good, clearly useful semantics, these will be
implementable without breaking code later.


I think an improvement over undefined behaviour would be to throw an
exception.


The only problem is that there's no easy, well-documented way to tell 
from the return value of opApply whether it was a break, a goto, a 
labeled break/continue, etc.  This would be implementable only if I 
changed the semantics of break to also throw.  This might not be a bad 
thing (IMHO any type of breaking out of a parallel foreach loop is just 
silly) but others had asked for different semantics for break.




Also, what happens if one of the tasks throws an exception?



It gets rethrown when yieldWait()/spinWait()/workWait() is called.  In 
the case of the higher-level primitives, it gets re-thrown to the 
calling thread at some non-deterministic point in the execution of these 
functions.  I didn't see the need to document this explicitly because it 
"just works".




Re: review of std.parallelism

2011-03-19 Thread bearophile
dsimcha:

> 1.  I give up.
> 
> 2.  I wish someone had told me earlier.

Please, don't give up. It's been a long time since I first asked for a 
parallel map in Phobos :-)

Bye,
bearophile


Re: std.parallelism: Final review

2011-03-19 Thread dsimcha

On 3/19/2011 3:08 PM, Caligo wrote:



On Fri, Mar 4, 2011 at 3:05 PM, Lars T. Kyllingstad
 wrote:

David Simcha has made a proposal for an std.parallelism module to be
included in Phobos.  We now begin the formal review process.

The code repository and documentation can be found here:

https://github.com/dsimcha/std.parallelism/wiki
http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html


Please review the code and the API, and post comments in this thread
within the next three weeks.

On 25 March I will start a new thread for voting over the inclusion of
the module.  Voting will last one week, until 1 April.  Votes cast
before
or after this will not be counted.

David, do you have any comments?

-Lars


Is std.parallelism better suited for data parallelism, task parallelism,
or both?


Both to some degree, but with more emphasis on data parallelism.


And how does it compare to something like OpenMP?


It was not **explicitly** designed to be an OMP killer, but it supports 
parallel foreach (which can be made into a parallel for using 
std.range.iota) and parallel reduce.  The synchronization primitives 
that OMP supports are already in druntime.


A major advantage over OpenMP is that std.parallelism is implemented 
within the language.  This means it's mostly portable across compilers 
and platforms and can easily be modified if you don't like something in 
it.  It also means that the syntax is more consistent with standard D 
syntax rather than being a bunch of weird looking pragmas.
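
The iota trick mentioned above, sketched against the reviewed API:

```d
import std.parallelism, std.range;

void main()
{
    auto squares = new long[100];

    // A "parallel for" in the OpenMP sense: iota supplies the index range,
    // and parallel foreach distributes the iterations across the pool.
    foreach (i; parallel(iota(squares.length)))
        squares[i] = cast(long)(i * i);
}
```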




Re: Quo vadis, D2? Thoughts on the D library ecosystem.

2011-03-19 Thread Jesse Phillips
David Nadlinger Wrote:

> Should we build something like a staging area, an incubator for 
> community contributions not taken yet through formal review, but of 
> interest for a wider audience? What about the etc.* package – would it 
> be an option to expand it into such an incubation area? If not, what 
> should it evolve into – a collection of C-level library bindings (see 
> the recent discussion on SQLite bindings started by David Simcha)? Who 
> will take care of the maintenance duties?

While growing the standard library and developing a packaging system for D 
code are both great, we really need a good central search repository: 
something that offers maintenance and stability, and is searchable by category.

That said, there really isn't a great number of mature, ready-to-use libraries 
yet. So why not just use what we already have, and keep it updated with these 
candidate libraries:

http://www.prowiki.org/wiki4d/wiki.cgi?DevelopmentWithD/Libraries


Re: review of std.parallelism

2011-03-19 Thread dsimcha

On 3/19/2011 2:25 PM, Michel Fortin wrote:

On 2011-03-19 14:14:51 -0400, Michel Fortin 
said:


I'm not too convinced about the "I know what I'm doing" argument when
I look at this example from asyncBuf's documentation:

auto lines = File("foo.txt").byLine();
auto duped = map!"a.idup"(lines); // Necessary b/c byLine() recycles buffer

// Fetch more lines in the background while we process the lines already
// read into memory into a matrix of doubles.
double[][] matrix;
auto asyncReader = taskPool.asyncBuf(duped);

foreach (line; asyncReader) {
    auto ls = line.split("\t");
    matrix ~= to!(double[])(ls);
}

Look at the last line of the foreach. You are appending to a
non-shared array from many different threads. How is that not a race
condition?


Or maybe I just totally misunderstood asyncBuf. Rereading the
documentation I'm under the impression I'd have to write this to get
what I expected:

foreach (line; parallel(asyncReader))
...

And that would cause a race condition. If that's the case, the example
is fine. Sorry for the misunderstanding.



Right.  And this is pretty obviously a race.  The other example (without 
the parallel) is completely safe.


Re: std.parallelism: Final review

2011-03-19 Thread Caligo
On Fri, Mar 4, 2011 at 3:05 PM, Lars T. Kyllingstad
 wrote:

> David Simcha has made a proposal for an std.parallelism module to be
> included in Phobos.  We now begin the formal review process.
>
> The code repository and documentation can be found here:
>
>  https://github.com/dsimcha/std.parallelism/wiki
>  http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html
>
> Please review the code and the API, and post comments in this thread
> within the next three weeks.
>
> On 25 March I will start a new thread for voting over the inclusion of
> the module.  Voting will last one week, until 1 April.  Votes cast before
> or after this will not be counted.
>
> David, do you have any comments?
>
> -Lars
>

Is std.parallelism better suited for data parallelism, task parallelism, or
both?  And how does it compare to something like OpenMP?


Re: Dream package management system (Was: a cabal for D ?)

2011-03-19 Thread Jacob Carlborg

On 2011-03-19 17:05, Chris Manning wrote:

On 19/03/2011 14:36, Jacob Carlborg wrote:

On 2011-03-18 18:04, Chris Manning wrote:

On 17/03/2011 22:49, Jason E. Aten wrote:

Somewhat tongue in cheek, we could call it dabal.

As in, "get on dabal!" :-)


If D gets accepted for Google Summer of Code, I think this would be a
great idea for a project and I would be interested in implementing it as
a student. Although, it does seem overly ambitious so maybe only some of
this could be for the gsoc (and if I do this It'd be great to carry on
working on it anyway).

What does everybody think about this? Should I draw up a proposal of
some kind?

Chris


I've been thinking for quite some time to build a package management
system for D, lets call it dpac as an example. This is the ideas that I
have:

Basically copy how RubyGems works.
Use Ruby as a DSL for dpacspec files, which are used to create the dpac 
file. This is an example of how a file used to build a package could look:

name "Foo Bar"
summary "This is the Foo Bar package"
version "1.0.0"
type :lib
author "Jacob Carlborg"
files ["lib.d"] # list of the files in the package
build :make # other options could be :dsss :cmake and so on
dversion 2 # D1 or D2

Build a dpac package out of the dpacspec file:

dpac foobar.dpacspec

Publish the package:

$ dpac publish foobar

Install the package:

$ dpac install foobar

A dpac package would just be a zip file (or some other type of archive)
containing all the necessary files to build the package and a file with
meta data.

All packages would be managed on a basic RESTful web server, using GET to
download a package and POST to publish a package.

I'm working on a build system for D that I was thinking about 
integrating with the package management system. Then the build system
could track the files needed to build the package, making the "files"
attribute optional.

I also have a tool called DVM, https://bitbucket.org/doob/dvm , used for
installing and managing different versions of D compilers. I was
thinking about integrating DVM with the package management system to be
able to install different packages for different compilers.


I was thinking about something more similar to portage's ebuild system
or arch's AUR. This would mean that the sources could be stored anywhere
and just the info to build a package would be stored in a centralised
location.

For publishing these build "scripts", again, AUR's system comes to mind,
although perhaps something more controll(ed/able). Also, gentoo's
sunrise overlay.

Of course, this comes with the downside of the user having to compile
the packages on their end.

Chris


Oh, I forgot a part. Just as with RubyGems, you would be able to install 
a package from a repository (git, mercurial or something else) as long 
as it contains the package specification file (dpacspec).


Of course, you could also manually download and install a package.

--
/Jacob Carlborg


Re: std.parallelism: Final review

2011-03-19 Thread Michel Fortin

On 2011-03-19 10:45:12 -0400, dsimcha  said:

I've added a priority property to TaskPool that allows setting the OS 
priority of the threads in the pool.  This just forwards to 
core.thread.priority(), so usage is identical.


Great.

Next to "priority" I notice the "makeDaemon" and "makeAngel" 
functions... wouldn't it make more sense to mirror the core.thread API 
for this too and make an "isDaemon" property out of these?



Since we don't know what the API for querying stuff like this should 
be, I had made it private.  I changed it to public.  I realized that, 
even if a more full-fledged API is added at some point for this stuff, 
there should be an obvious, convenient way to get it directly from 
std.parallelism anyhow, and it would be trivial to call whatever API 
eventually evolves to set this value.  Now, if you don't like the -1 
thing, you can just do:


auto pool = new TaskPool(osReportedNcpu);

or

defaultPoolThreads = osReportedNcpu;


Also good.

--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



Re: review of std.parallelism

2011-03-19 Thread Michel Fortin

On 2011-03-19 13:16:02 -0400, dsimcha  said:


* "A goto from inside the parallel foreach loop to a label outside the
loop will result in undefined behavior." Would this be a bug in dmd?


No, it's because a goto of this form has no reasonable, useful 
semantics.  I should probably mention in the docs that the same applies 
to labeled break and continue.


I have no idea what semantics these should have, and even if I did, 
given the long odds that even one person would actually need them, I 
think they'd be more trouble than they're worth to implement.  For 
example, once you break out of a parallel foreach loop to some 
arbitrary address (and different threads can goto different labels, 
etc.), well, it's no longer a parallel foreach loop.  It's just a bunch 
of completely unstructured threading doing god-knows-what.


Therefore, I slapped undefined behavior on it as a big sign that says, 
"Just don't do it."  This also has the advantage that, if anyone ever 
thinks of any good, clearly useful semantics, these will be 
implementable without breaking code later.


I think an improvement over undefined behaviour would be to throw an exception.

Also, what happens if one of the tasks throws an exception?

--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



Re: review of std.parallelism

2011-03-19 Thread Michel Fortin

On 2011-03-19 14:14:51 -0400, Michel Fortin  said:

I'm not too convinced about the "I know what I'm doing" argument when I 
look at this example from asyncBuf's documentation:


auto lines = File("foo.txt").byLine();
auto duped = map!"a.idup"(lines);  // Necessary b/c byLine() 
recycles buffer


// Fetch more lines in the background while we process the lines already
// read into memory into a matrix of doubles.
double[][] matrix;
auto asyncReader = taskPool.asyncBuf(duped);

foreach(line; asyncReader) {
auto ls = line.split("\t");
matrix ~= to!(double[])(ls);
}

Look at the last line of the foreach. You are appending to a non-shared 
array from many different threads. How is that not a race condition?


... or maybe I just totally misunderstood asyncBuf. Rereading the 
documentation I'm under the impression I'd have to write this to get 
what I expected:


foreach (line; parallel(asyncReader))
...

And that would cause a race condition. If that's the case, the example 
is fine. Sorry for the misunderstanding.


--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



Re: review of std.parallelism

2011-03-19 Thread Michel Fortin

On 2011-03-19 13:20:00 -0400, dsimcha  said:


On 3/19/2011 1:09 PM, Michel Fortin wrote:

For instance:

void main() {
int sum = 0;
foreach (int value; taskPool.parallel([0,2,3,6,1,4,6,3,3,3,6])) {
sum += value;
}
writeln(sum);
}

The "+=" would need to be an atomic operation to avoid low-level races.

I think that ParallelForeach's opApply should only accept a shared
delegate. I define shared delegate as a delegate that does not reference
any non-shared variables of its outer scope. The problem is that DMD
currently doesn't know how to determine whether a delegate literal is
shared or not, thus a delegate literal is never shared and if
ParallelForeach's opApply asked a shared delegate as it should it would
just not work. Fix DMD to create shared delegate literals where
appropriate and everything can be guaranteed race-free.


If you want pedal-to-metal parallelism without insane levels of 
verbosity, you can't have these safety features.


I'm not sure where my proposal asks for more verbosity or less 
performance. All I can see is a few less casts in std.parallelism and 
that it'd disallow the case in my example above that is totally wrong. 
Either you're interpreting it wrong or there are things I haven't 
thought about (and I'd be happy to know about them).


But by looking at all the examples in the documentation, I cannot find 
one that would need to be changed... well, except the one I'll discuss 
below.



I thought long and hard about this issue before submitting this lib for 
review and concluded that any solution would make std.parallelism so 
slow, so limited and/or such a PITA to use that I'd rather it just punt 
these issues to the programmer.  In practice, parallel foreach is used 
with very small, performance-critical snippets that are fairly easy to 
reason about even without any language-level race safety.


I'm not too convinced about the "I know what I'm doing" argument when I 
look at this example from asyncBuf's documentation:


   auto lines = File("foo.txt").byLine();
   auto duped = map!"a.idup"(lines);  // Necessary b/c byLine() 
recycles buffer


   // Fetch more lines in the background while we process the lines already
   // read into memory into a matrix of doubles.
   double[][] matrix;
   auto asyncReader = taskPool.asyncBuf(duped);

   foreach(line; asyncReader) {
   auto ls = line.split("\t");
   matrix ~= to!(double[])(ls);
   }

Look at the last line of the foreach. You are appending to a non-shared 
array from many different threads. How is that not a race condition?


With my proposal, the compiler would have caught that because opApply 
would want the foreach body to be  a shared delegate, and 
reading/writing to non-shared variable "matrix" in the outer scope from 
a shared delegate literal would be an error.


I'm not too sure how hard it'd be to do that in the compiler, but I 
think it's the right thing to do. Once the compiler can do this we can 
have safety; until that time I'd agree to see std.parallelism stay 
unsafe.


--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



Re: review of std.parallelism

2011-03-19 Thread dsimcha

On 3/19/2011 12:03 PM, Andrei Alexandrescu wrote:


* "workUnitSize: The number of elements to evaluate in a single Task.
Must be less than or equal to bufSize, and in practice should be a
fraction of bufSize such that all worker threads can be used." Then why
not specify a different parameter such as a multiplier or something? The
dependence between bufSize and workUnitSize is a sign that these two
should be improved. If you have good reasons that the user must have the
parameters in this form, give an example substantiating that.


I did this for consistency with all the other functions, where work unit 
size is specified directly.  I don't want lazyMap to be the oddball 
function where it's specified in a completely different way.
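For illustration, a hedged sketch of passing the work unit size directly, as the rest of the API does (taskPool.parallel takes it as a second argument; the loop body here is a made-up placeholder):

```d
import std.parallelism : taskPool;
import std.range : iota;

void main() {
    auto results = new double[10_000];

    // The work unit size (100) is specified directly, consistent with the
    // rest of the API: each task handles 100 consecutive indices.
    foreach (i; taskPool.parallel(iota(results.length), 100)) {
        results[i] = cast(double) i * i; // placeholder per-element work
    }
}
```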


Re: review of std.parallelism

2011-03-19 Thread dsimcha

On 3/19/2011 1:09 PM, Michel Fortin wrote:

On 2011-03-19 12:03:51 -0400, Andrei Alexandrescu
 said:


* "Most of this module completely subverts..." Vague characterizations
("most", "completely", "some") don't belong in a technical
documentation. (For example there's either subversion going on or
there isn't.) Also, std.concurrency and std.parallelism address
different needs so there's little competition between them. Better:
"Unless explicitly marked as $(D @trusted) or $(D @safe), artifacts in
this module are not provably memory-safe and cannot be used with
SafeD. If used as documented, memory safety is guaranteed."


Actually, I think this is a bad description of what it subverts. What it
subverts isn't the memory-safety that SafeD provides, but the safety
against low-level races that even unsafe D protects against unless you
cast shared away. For instance:

void main() {
int sum = 0;
foreach (int value; taskPool.parallel([0,2,3,6,1,4,6,3,3,3,6])) {
sum += value;
}
writeln(sum);
}

The "+=" would need to be an atomic operation to avoid low-level races.

I think that ParallelForeach's opApply should only accept a shared
delegate. I define shared delegate as a delegate that does not reference
any non-shared variables of its outer scope. The problem is that DMD
currently doesn't know how to determine whether a delegate literal is
shared or not, thus a delegate literal is never shared and if
ParallelForeach's opApply asked a shared delegate as it should it would
just not work. Fix DMD to create shared delegate literals where
appropriate and everything can be guaranteed race-free.



If you want pedal-to-metal parallelism without insane levels of 
verbosity, you can't have these safety features.  I thought long and 
hard about this issue before submitting this lib for review and 
concluded that any solution would make std.parallelism so slow, so 
limited and/or such a PITA to use that I'd rather it just punt these 
issues to the programmer.  In practice, parallel foreach is used with 
very small, performance-critical snippets that are fairly easy to reason 
about even without any language-level race safety.  This is a 
fundamental design decision that will not be changing.  If it's 
unacceptable then:


1.  I give up.

2.  I wish someone had told me earlier.


Re: review of std.parallelism

2011-03-19 Thread dsimcha

On 3/19/2011 12:03 PM, Andrei Alexandrescu wrote:

On 03/19/2011 02:32 AM, dsimcha wrote:

Ok, thanks again for clarifying **how** the docs could be improved. I've
implemented the suggestions and generally given the docs a good reading
over and clean up. The new docs are at:

http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html


* Still no synopsis example that illustrates in a catchy way the most
attractive artifacts.


I don't see what I could put here that isn't totally redundant with the 
rest of the documentation.  Anything I could think of would basically 
just involve concatenating all the examples.  Furthermore, none of the 
other Phobos modules have this, so I don't know what one should look 
like.  In general I feel like std.parallelism is being held to a 
ridiculously high standard that none of the other Phobos modules 
currently meet.




* "After creation, Task objects are submitted to a TaskPool for
execution." I understand it's possible to use Task straight as a
promise/future, so s/are/may be/.


No.  The only way Task is useful is by submitting it to a pool to be 
executed.  (Though this may change, see below.)



Also it is my understanding that
TaskPool efficiently adapts the concrete number of CPUs to an arbitrary
number of tasks. If that's the case, it's worth mentioning.


Isn't this kind of obvious from the examples, etc.?


* "If a Task has been submitted to a TaskPool instance, is being stored
in a stack frame, and has not yet finished, the destructor for this
struct will automatically call yieldWait() so that the task can finish
and the stack frame can be destroyed safely." At this point in the doc
the reader doesn't understand that at all because TaskPool has not been
seen yet. The reader gets worried that she'll be essentially serializing
the entire process by mistake. Either move this explanation down or
provide an example.


This is getting ridiculous.  There are too many circular dependencies 
between Task and TaskPool to remove here; I'm not even going to try. 
One or the other has to be introduced first, but 
neither can be explained without mentioning the other.  This is why I 
explain the relationship briefly in the module level summary, so that 
the user has at least some idea.  I think this is about the best I can do.




* Is done() a property?


Yes.  DDoc sucks.



* The example that reads two files at the same time should NOT use
taskPool. It's just one task, why would the pool ever be needed? If you
also provided an example that reads n files in memory at the same time
using a pool, that would illustrate nicely why you need it. If a Task
can't be launched without being put in a pool, there should be a
possibility to do so. At my work we have a simple function called
callInNewThread that does what's needed to launch a function in a new
thread.


I guess I could add something like this to Task.  Under the hood it 
would (for implementation simplicity, to reuse a bunch of code from 
TaskPool) fire up a new single-thread pool, submit the task, call 
TaskPool.finish(), and then return.  Since you're already creating a new 
thread, the extra overhead of creating a new TaskPool for the thread 
would be negligible and it would massively simplify the implementation. 
 My only concern is that, when combined with scoped versus non-scoped 
tasks (which are definitely here to stay, see below) this small 
convenience function would add way more API complexity than it's worth. 
 Besides, task pools don't need to be created explicitly anyhow. 
That's what the default instantiation is for.  I don't see how 
callInNewThread would really solve much.
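The shape described above could be sketched roughly like this (hypothetical helper; method names such as yieldWait() are as in the reviewed docs, so treat this as a sketch rather than a working implementation):

```d
import std.parallelism : task, TaskPool;

// Hypothetical convenience function: run fn(args) in its own thread by
// wrapping it in a throwaway single-thread pool, as described above.
auto callInNewThread(alias fn, Args...)(Args args) {
    auto t = task!fn(args);      // create the task
    auto pool = new TaskPool(1); // pool with exactly one worker thread
    pool.put(t);                 // submit it
    scope(exit) pool.finish();   // let the pool shut down afterwards
    return t.yieldWait();        // wait for completion, return the result
}
```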


Secondly, I think you're reading **WAY** too much into what was meant to 
be a simple example to illustrate usage mechanics.  This is another case 
where I can't think of a small, cute example of where you'd really need 
the pool.  There are plenty of larger examples, but the smallest/most 
self-contained one I can think of is a parallel sort.  I decided to use 
file reading because it was good enough to illustrate the mechanics of 
usage, even if it didn't illustrate a particularly good use case.




* The note below that example gets me thinking: it is an artificial
limitation to force users of Task to worry about scope and such. One
should be able to create a Future object (Task I think in your
terminology), pass it around like a normal value, and ultimately force
it. This is the case for all other languages that implement futures. I
suspect the "scope" parameter associated with the delegate a couple of
definitions below plays a role here, but I think we need to work for
providing the smoothest interface possible (possibly prompting
improvements in the language along the way).


This is what TaskPool.task is for.  Maybe this should be moved to the 
top of the definition of TaskPool and emphasized, and the scoped/stack 
allocated versions should be moved below TaskPool and de-emphasized?


At any rate,

Re: review of std.parallelism

2011-03-19 Thread Michel Fortin
On 2011-03-19 12:03:51 -0400, Andrei Alexandrescu 
 said:


* "Most of this module completely subverts..." Vague characterizations 
("most", "completely", "some") don't belong in a technical 
documentation. (For example there's either subversion going on or there 
isn't.) Also, std.concurrency and std.parallelism address different 
needs so there's little competition between them. Better: "Unless 
explicitly marked as $(D @trusted) or $(D @safe), artifacts in this 
module are not provably memory-safe and cannot be used with SafeD. If 
used as documented, memory safety is guaranteed."


Actually, I think this is a bad description of what it subverts. What 
it subverts isn't the memory-safety that SafeD provides, but the safety 
against low-level races that even unsafe D protects against unless you 
cast shared away. For instance:


void main() {
int sum = 0;
foreach (int value; taskPool.parallel([0,2,3,6,1,4,6,3,3,3,6])) 
{
sum += value;
}
writeln(sum);
}

The "+=" would need to be an atomic operation to avoid low-level races.
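As an aside, this particular loop can be made race-free with the existing library, either by letting the pool do the reduction or by making the accumulation atomic; a minimal sketch:

```d
import core.atomic : atomicOp;
import std.parallelism : taskPool;
import std.stdio : writeln;

void main() {
    auto data = [0, 2, 3, 6, 1, 4, 6, 3, 3, 3, 6];

    // Option 1: no shared accumulator at all; the pool combines partial sums.
    auto sum1 = taskPool.reduce!"a + b"(data);

    // Option 2: keep the foreach, but make the "+=" atomic.
    shared int sum2 = 0;
    foreach (value; taskPool.parallel(data)) {
        atomicOp!"+="(sum2, value);
    }

    writeln(sum1, " ", sum2); // both sums are 37
}
```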

I think that ParallelForeach's opApply should only accept a shared 
delegate. I define shared delegate as a delegate that does not 
reference any non-shared variables of its outer scope. The problem is 
that DMD currently doesn't know how to determine whether a delegate 
literal is shared or not, thus a delegate literal is never shared and 
if ParallelForeach's opApply asked a shared delegate as it should it 
would just not work. Fix DMD to create shared delegate literals where 
appropriate and everything can be guaranteed race-free.


--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



Re: Dream package management system (Was: a cabal for D ?)

2011-03-19 Thread Chris Manning

On 19/03/2011 14:36, Jacob Carlborg wrote:

On 2011-03-18 18:04, Chris Manning wrote:

On 17/03/2011 22:49, Jason E. Aten wrote:

Somewhat tongue in cheek, we could call it dabal.

As in, "get on dabal!" :-)


If D gets accepted for Google Summer of Code, I think this would be a
great idea for a project and I would be interested in implementing it as
a student. It does seem overly ambitious, though, so maybe only some of
this could be for the GSoC (and if I do this it'd be great to carry on
working on it anyway).

What does everybody think about this? Should I draw up a proposal of
some kind?

Chris


I've been thinking for quite some time about building a package management
system for D; let's call it dpac as an example. These are the ideas that I
have:

Basically copy how RubyGems works.
Use Ruby as a DSL for dpacspec files, which are used to create
the dpac file. This is an example of how a file used to build a package
could look:

name "Foo Bar"
summary "This is the Foo Bar package"
version "1.0.0"
type :lib
author "Jacob Carlborg"
files ["lib.d"] # list of the files in the package
build :make # other options could be :dsss :cmake and so on
dversion 2 # D1 or D2

Build a dpac package out of the dpacspec file:

$ dpac foobar.dpacspec

Publish the package:

$ dpac publish foobar

Install the package:

$ dpac install foobar

A dpac package would just be a zip file (or some other type of archive)
containing all the necessary files to build the package and a file with
meta data.

All packages would be managed on a basic RESTful web server, using GET to
download a package and POST to publish one.

I'm working on a build system for D that I was thinking about to
integrate with the package management system. Then the build system
could track the files needed to build the package, making the "files"
attribute optional.

I also have a tool called DVM, https://bitbucket.org/doob/dvm , used for
installing and managing different versions of D compilers. I was
thinking about integrating DVM with the package management system to be
able to install different packages for different compilers.

I was thinking about something more similar to portage's ebuild system 
or arch's AUR. This would mean that the sources could be stored anywhere 
and just the info to build a package would be stored in a centralised 
location.


For publishing these build "scripts", again, AUR's system comes to mind, 
although perhaps something more controll(ed/able). Also, gentoo's 
sunrise overlay.


Of course, this comes with the downside of the user having to compile 
the packages on their end.


Chris


Re: review of std.parallelism

2011-03-19 Thread Andrei Alexandrescu

On 03/19/2011 10:28 AM, dsimcha wrote:

On 3/19/2011 10:54 AM, Andrei Alexandrescu wrote:

Towards the bottom of the document there are overloads of task that
don't have examples.


You mean TaskPool.task()? Since these are such slight variations of the
other overloads, I thought an example would be overkill. Since people
less familiar with the library don't think so, though, I've added
examples that are accordingly slight variations of the examples for the
other overloads.


A great way to handle bunches of almost-identical overloads is to group 
them together with /// ditto and explain the slight differences in the 
consolidated documentation.
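For example, the grouping looks like this in DDoc (hypothetical function, just to show the mechanics):

```d
/**
Doubles every element of arr.

The overload taking a work unit size behaves identically, but lets the
caller tune how the work is split up.
*/
void doubleAll(int[] arr) { foreach (ref x; arr) x *= 2; }

/// ditto
void doubleAll(int[] arr, size_t workUnitSize) {
    // workUnitSize is ignored in this sketch; it only illustrates how
    // /// ditto folds both overloads into one documentation entry.
    foreach (ref x; arr) x *= 2;
}
```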


Andrei


Re: review of std.parallelism

2011-03-19 Thread Andrei Alexandrescu

On 03/19/2011 02:32 AM, dsimcha wrote:

Ok, thanks again for clarifying **how** the docs could be improved. I've
implemented the suggestions and generally given the docs a good reading
over and clean up. The new docs are at:

http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html


* When using "parallelism" as a common noun, prefix it with a '_' so 
ddoc doesn't underline it.


* "Most of this module completely subverts..." Vague characterizations 
("most", "completely", "some") don't belong in a technical 
documentation. (For example there's either subversion going on or there 
isn't.) Also, std.concurrency and std.parallelism address different 
needs so there's little competition between them. Better: "Unless 
explicitly marked as $(D @trusted) or $(D @safe), artifacts in this 
module are not provably memory-safe and cannot be used with SafeD. If 
used as documented, memory safety is guaranteed."


* Speaking of std.concurrency vs. std.parallelism, the first paragraph 
might be something like: "This module implements high-level primitives 
for shared memory SMP parallelism. These include parallel foreach, 
parallel reduce, parallel eager map, pipelining and future/promise 
parallelism primitives. $(D std._parallelism) is best recommended when 
the same operation is to be executed in parallel over different data. 
For communication between arbitrary threads, see $(D std.concurrency)."


* Still no synopsis example that illustrates in a catchy way the most 
attractive artifacts.


* "After creation, Task objects are submitted to a TaskPool for 
execution." I understand it's possible to use Task straight as a 
promise/future, so s/are/may be/. Also it is my understanding that 
TaskPool efficiently adapts the concrete number of CPUs to an arbitrary 
number of tasks. If that's the case, it's worth mentioning.


* "A call to workWait(), yieldWait(), or spinWait() can be used to 
retrive the return value after the function is finished executing." As 
an aside, a spell checking step would be great ("retrieve") - just put 
the text in an editor with spellchecking. I think what this means is: 
"The methods workWait(), yieldWait(), or spinWait() make sure that the 
function finishes execution and then return its result to the initiating 
thread. Each uses a different waiting strategy as detailed below."


* "If a Task has been submitted to a TaskPool instance, is being stored 
in a stack frame, and has not yet finished, the destructor for this 
struct will automatically call yieldWait() so that the task can finish 
and the stack frame can be destroyed safely." At this point in the doc 
the reader doesn't understand that at all because TaskPool has not been 
seen yet. The reader gets worried that she'll be essentially serializing 
the entire process by mistake. Either move this explanation down or 
provide an example.


* "Function results are returned from yieldWait() and friends by ref." 
Someone coming from C++ may be thrown off by this sudden casual use of 
"friends" and think there's a notion of friendship by reference in D. 
Better: "The forcing methods yieldWait(), workWait(), and spinWait() 
return the result by reference."


* Speaking of which, I'd replace "Wait" with "Force". Right now the 
nomenclature is far removed from futures and promises.


* Is done() a property?

* The example that reads two files at the same time should NOT use 
taskPool. It's just one task, why would the pool ever be needed? If you 
also provided an example that reads n files in memory at the same time 
using a pool, that would illustrate nicely why you need it. If a Task 
can't be launched without being put in a pool, there should be a 
possibility to do so. At my work we have a simple function called 
callInNewThread that does what's needed to launch a function in a new 
thread.


* The note below that example gets me thinking: it is an artificial 
limitation to force users of Task to worry about scope and such. One 
should be able to create a Future object (Task I think in your 
terminology), pass it around like a normal value, and ultimately force 
it. This is the case for all other languages that implement futures. I 
suspect the "scope" parameter associated with the delegate a couple of 
definitions below plays a role here, but I think we need to work for 
providing the smoothest interface possible (possibly prompting 
improvements in the language along the way).


* I'm not sure how to interpret the docs for

ReturnType!(F) run(F, Args...)(F fpOrDelegate, ref Args args);

So it's documented but I'm not supposed to care. Why not just remove? 
Surely there must be higher-level examples that clarify that I can use 
delegates etc.


* The examples have code at top-level. That's fine for short snippets 
but not when using import etc. I recommend putting the code inside 
unittests or function bodies for such cases.


* "If you want to escape the Task object from the function in which it 
was created or prefer to heap alloca

'where' statement part II

2011-03-19 Thread bearophile
These are mostly weekend musings.
I've found another possible usage for the 'where' statement. But first let me 
introduce the topic better.

This page contains a little problem:
http://csokavar.hu/blog/2010/04/20/problem-of-the-week-9-digit-problem/

The problem:
Find a number consisting of 9 digits in which each of the digits from 1 to 9 
appears only once. This number must also satisfy these divisibility 
requirements:
1. The number should be divisible by 9.
2. If the rightmost digit is removed, the remaining number should be divisible 
by 8.
3. If the rightmost digit of the new number is removed, the remaining number 
should be divisible by 7.
4. And so on, until there’s only one digit (which will necessarily be divisible 
by 1).
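For contrast with the compile-time versions below, the search is short as an ordinary run-time D function (my own sketch, not from the original page):

```d
import std.stdio : writeln;

// Depth-first search: grow the number one digit at a time, requiring the
// k-digit prefix to be divisible by k and each digit 1..9 to be used once.
long search(long prefix, int len, ref bool[10] used) {
    if (len == 9)
        return prefix; // all nine digits placed: this is a solution
    foreach (d; 1 .. 10) {
        if (used[d]) continue;
        immutable next = prefix * 10 + d;
        if (next % (len + 1) == 0) {
            used[d] = true;
            immutable r = search(next, len + 1, used);
            if (r != -1) return r;
            used[d] = false; // backtrack
        }
    }
    return -1; // dead end
}

void main() {
    bool[10] used;
    writeln(search(0, 0, used)); // prints 381654729
}
```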


A solution using just C++0x templates (not found by me, modified):

-

// g++ -ftemplate-depth-2000 -std=c++0x nine_digits.cpp
#include "stdio.h"

template<int... digits>
struct Value;

template<>
struct Value<> {
    static const int v = 0;
};

template<int first, int... rest>
struct Value<first, rest...> {
    static int const v = 10 * Value<rest...>::v + first;
};

template<int elem, int... digits>
struct Contains;

template<int elem>
struct Contains<elem> {
    static const bool v = false;
};

template<int elem, int first, int... rest>
struct Contains<elem, first, rest...> {
    static const bool v = elem == first || Contains<elem, rest...>::v;
};

template<int... digits>
struct DivisorTest;

template<>
struct DivisorTest<> {
    static const bool v = true;
};

template<int first, int... rest>
struct DivisorTest<first, rest...> {
    static const int num = Value<first, rest...>::v;
    static const int div = sizeof...(rest) + 1;
    static const int mod = num % div;
    static const bool v = mod == 0;
};

template<int first, int... rest>
struct TestCandidate {
    static const bool v = (DivisorTest<first, rest...>::v && !Contains<first, rest...>::v);
};

template<int length, int... digits>
struct Search;

template<int length, bool good, bool final, int digit, int... rest>
struct SearchI {
    static const int v = Search<length, digit + 1, rest...>::v;
};

template<int length, int digit, int... rest>
struct SearchI<length, true, true, digit, rest...> {
    static const int v = Value<digit, rest...>::v;
};

template<int length, int digit, int... rest>
struct SearchI<length, true, false, digit, rest...> {
    static const int v = Search<length, 1, digit, rest...>::v;
};

template<int length, int digit, int... rest>
struct Search<length, digit, rest...> {
    static const bool good = TestCandidate<digit, rest...>::v;
    static const bool final = good && 1 + sizeof...(rest) == length;
    static const int v = SearchI<length, good, final, digit, rest...>::v;
};

template<int length>
struct Search<length, 10> {
    static const int v = -1;
};

template<int length, int next, int... rest>
struct Search<length, 10, next, rest...> {
    static const int v = Search<length, next + 1, rest...>::v;
};

template<int length>
struct Search<length> {
    static const int v = Search<length, 1>::v;
};

int main() {
    printf("%d\n", Search<7>::v);
    return 0;
}

-

A translation of the C++0x to Haskell, from here, modified:
http://gergo.erdi.hu/cs/ninedigits/lorentey-c%2B%2B-tmp/lorentey-c%2B%2B-tmp.hs


value :: [Int] -> Int
value [] = 0
value (first:rest) = 10 * (value rest) + first

contains :: Int -> [Int] -> Bool
contains elem [] = False
contains elem (first:rest) = elem == first || (contains elem rest)

divisor_test :: [Int] -> Bool
divisor_test (first:rest) = mod' == 0
where num = value (first:rest)
  div = (length rest) + 1
  mod' = num `mod` div

test_candidate :: [Int] -> Bool
test_candidate (first:rest) = divisor_test (first:rest) && (not (contains first 
rest))

search_i :: Int -> Bool -> Bool -> [Int] -> Int
search_i len True True  (digit:rest) = value (digit:rest)
search_i len True False (digit:rest) = search len (1:digit:rest)
search_i len good final (digit:rest) = search len (digit+1:rest)

search :: Int -> [Int] -> Int
search len [] = search len [1]
search len [10]   = -1
search len (10:next:rest) = search len ((next+1):rest)
search len (digit:rest)   = search_i len good final (digit:rest)
where good = test_candidate (digit:rest)
  final = good && 1 + (length rest) == len

main = print $ search 9 []

-

A translation of the Haskell code to D2 templates:


import core.stdc.stdio: printf;
import std.typetuple: allSatisfy;

template IsInteger(alias x) {
enum bool IsInteger = is(typeof(x) == int);
}

template AreAllIntegers(args...) {
enum bool AreAllIntegers = allSatisfy!(IsInteger, args);
}

template value(args...) if (AreAllIntegers!args) {
static if (args.length == 0)
enum int value = 0;
else
enum int value = 10 * value!(args[1..$]) + args[0];
}

template contains(int elem, args...) if (AreAllIntegers!args) {
static if (args.length == 0)
enum bool contains = false;
else
enum bool contains = elem == args[0] || contains!(elem, args[1..$]);
}

template divisor_test(args...) if (AreAllIntegers!args) {
enum bool divisor_test = (value!args % args.length) == 0;
}

template test_candidate(args...) if (AreAllIntegers!args) {
enum bool test_candidate = divisor_test!args && !(contains!args);
}

template search_i(int length, bool good, bool isFinal, digits...) if 
(AreAllIntegers!digits) {
static if (good && isFinal)
enum int search_i = value!digits;
else static if (good && !isFinal)
enum int search_i = search!(length, 1, digits);
else
enum int search_i = search!(length, digits[0]+1, digits[1..$]);
}

template search(int length, digits...) if (AreAllIntegers!digits) {
static if (digits.length == 0)
enum int search = search!(length,

Trivial DMD fixes: GitHub pull requests vs. Bugzilla issues

2011-03-19 Thread David Nadlinger
For almost a month now, I have a trivial pull request open for DMD: 
https://github.com/D-Programming-Language/dmd/pull/10. It's only about 
adding the word »length« in two places to clarify the tuple 
out-of-bounds error message, so I didn't bother to open a ticket for it 
because I figured that it would only create unneeded administrative 
overhead for such a small change.


However, given that the commit has not been merged yet: Walter, do you 
still prefer Bugzilla issues for this kind of patch?


David


Re: review of std.parallelism

2011-03-19 Thread dsimcha

On 3/19/2011 10:54 AM, Andrei Alexandrescu wrote:

On 03/18/2011 11:40 PM, dsimcha wrote:


It should just be private. The fact that it's public is an artifact of
when I was designing worker-local storage and didn't know how it was
going to work yet. I never thought to revisit this until now. It really
isn't useful to client code.


It could be public and undocumented.


I've already made it private.  I can't see what purpose having it public 
would serve.  The fact that it was public before was **purely** an 
oversight.



* defaultPoolThreads - should it be a @property?


Yes. In spirit it's a global variable. It requires some extra
machinations, though, to be threadsafe, which is why it's not
implemented as a simple global variable.


Then it should be a @property. I think ddoc doesn't reflect that, but an
example could.


Right.  It's always been @property but ddoc doesn't reflect this.  I've 
changed the docs slightly to call it a "property" instead of a 
"function".  It seems like overkill to me to give examples for this, 
though, since it's just a getter and a setter.





* No example for task().


 Yes there is, for both flavors, though these could admittedly be
improved. Only the safe version doesn't have an example, and this is
just a more restricted version of the function pointer case, so it seems
silly to make a separate example for it.


Towards the bottom of the document there are overloads of task that
don't have examples.


You mean TaskPool.task()?  Since these are such slight variations of the 
other overloads, I thought an example would be overkill.  Since people 
less familiar with the library don't think so, though, I've added 
examples that are accordingly slight variations of the examples for the 
other overloads.
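A hedged sketch of the basic alias-based flavor, using the names from the final Phobos module (some of these were still named differently at the time of this review):

```d
import std.parallelism;

int square(int x) { return x * x; }

void main() {
    auto t = task!square(6); // create a Task from a function alias
    taskPool.put(t);         // submit it to the shared pool
    // block until the task finishes, then fetch the result
    assert(t.yieldForce == 36);
}
```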





* What is 'run' in the definition of safe task()?


It's just the run() adapter function. Isn't that obvious?


I'm referring to this:

Task!(run,TypeTuple!(F,Args)) task(F, Args...)(scope F delegateOrFp,
Args args);

What is "run"?


Ok, I ended up moving the docs for run() directly above the stuff that 
uses it.  run() is described as "Calls a delegate or function pointer 
with args. This is an adapter that makes Task work with delegates, 
function pointers and functors instead of just aliases. It is included 
in the documentation to clarify how this case is handled, but is not 
meant to be used directly by client code."


I know the Higher Principles of Encapsulation say this should be private 
and the relevant overloads should return auto.  I strongly believe, 
though, that being anal about encapsulation of this detail is silly 
since it is so unlikely to change and that exposing it helps to clarify 
what's really going on here.  Encapsulation is good up to a point, but 
sometimes it's just easier to think about things when you know how they 
really work at a concrete level, and this tradeoff needs to be weighed.
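To illustrate what such an adapter amounts to, here is a simplified sketch (not the exact Phobos source) of a forwarding function in the spirit of run():

```d
import std.traits : ReturnType;

// Forwards a delegate, function pointer, or functor call to its
// arguments. Task!(run, ...) can then treat the callable itself as
// the first stored "argument".
ReturnType!F run(F, Args...)(F fpOrDelegate, ref Args args)
{
    return fpOrDelegate(args);
}

void main() {
    int x = 3;
    auto dg = (int a) => a + 1;
    assert(run(dg, x) == 4);
}
```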


Re: review of std.parallelism

2011-03-19 Thread Andrei Alexandrescu

On 03/18/2011 11:40 PM, dsimcha wrote:

Thanks for the advice. You mentioned in the past that the documentation
was inadequate but didn't give enough specifics as to how until now. As
the author of the library, things seem obvious to me that don't seem
obvious to anyone else, so I don't feel that I'm in a good position to
judge the quality of the documentation and where it needs improvement. I
plan to fix most of the issues you raised, but I've left comments for
the few that I can't/won't fix or believe are based on misunderstandings
below.


Great, thanks.


On 3/18/2011 11:29 PM, Andrei Alexandrescu wrote:

1. Library proper:

* "In the case of non-random access ranges, parallel foreach is still
usable but buffers lazily to an array..." Wouldn't strided processing
help? If e.g. 4 threads the first works on 0, 4, 8, ... second works on
1, 5, 9, ... and so on.


You can have this if you want, by setting the work unit size to 1.
Setting it to a larger size just causes more elements to be buffered,
which may be more efficient in some cases.


Got it.
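A small sketch of the work-unit-size knob, assuming the parallel overload that takes an explicit work unit size:

```d
import std.parallelism;

void main() {
    auto data = new double[100];
    // work unit size of 1: each worker thread grabs one element at a
    // time, which approximates the strided scheme discussed above
    foreach (i, ref x; parallel(data, 1)) {
        x = i * i;
    }
    assert(data[10] == 100.0);
}
```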


* Why not make workerIndex a ulong and be done with it?


I doubt anyone's really going to create anywhere near 4 billion TaskPool
threads over the lifetime of a program. Part of the point of TaskPool is
recycling threads rather than paying the overhead of creating and
destroying them. Using a ulong on a 32-bit architecture would make
worker-local storage substantially slower. workerIndex is how
worker-local storage works under the hood, so it needs to be fast.


If you're confident that overflow won't occur, you may want to eliminate 
that detail from the docs. It throws off the reader.



 > * No example for workerIndex and why it's useful.

It should just be private. The fact that it's public is an artifact of
when I was designing worker-local storage and didn't know how it was
going to work yet. I never thought to revisit this until now. It really
isn't useful to client code.


It could be public and undocumented.


* Is stop() really trusted or just unsafe? If it's forcibly killing
threads then its unsafe.


It's not forcibly killing threads. As the documentation states, it has
no effect on jobs already executing, only ones in the queue.
Furthermore, it's needed unless makeDaemon is called. Speaking of which,
makeDaemon and makeAngel should probably be trusted, too.


Great. The more safe/trusted, the better.


* defaultPoolThreads - should it be a @property?


Yes. In spirit it's a global variable. It requires some extra
machinations, though, to be threadsafe, which is why it's not
implemented as a simple global variable.


Then it should be a @property. I think ddoc doesn't reflect that, but an 
example could.



* No example for task().


 Yes there is, for both flavors, though these could admittedly be
improved. Only the safe version doesn't have an example, and this is
just a more restricted version of the function pointer case, so it seems
silly to make a separate example for it.


Towards the bottom of the document there are overloads of task that 
don't have examples.



* What is 'run' in the definition of safe task()?


It's just the run() adapter function. Isn't that obvious?


I'm referring to this:

Task!(run,TypeTuple!(F,Args)) task(F, Args...)(scope F delegateOrFp, 
Args args);


What is "run"?


Andrei


Re: std.parallelism: Final review

2011-03-19 Thread dsimcha

On 3/19/2011 9:37 AM, Michel Fortin wrote:

On 2011-03-18 22:27:14 -0400, dsimcha  said:


I think your use case is both beyond the scope of std.parallelism and
better handled by std.concurrency. std.parallelism is mostly meant to
handle the pure multicore parallelism use case. It's not that it
**can't** handle other use cases, but that's not what it's tuned for.


I know. But if this finds its way into the standard library, perhaps it
should aim at reaching a slightly wider audience? Especially since it
lacks so little to become more general purpose...


Fair enough.  You've convinced me, since I've just recently started 
pushing std.parallelism in this direction in both my research work and 
in some of the examples I've been using, and you've given very good 
specific suggestions about **how** to expand things a little.






As far as prioritization, it wouldn't be hard to implement
prioritization of when a task starts (i.e. have a high- and
low-priority queue). However, the whole point of TaskPool is to avoid
starting a new thread for each task. Threads are recycled for
efficiency. This prevents changing the priority of things in the OS
scheduler. I also don't see how to generalize prioritization to map,
reduce, parallel foreach, etc. w/o making the API much more complex.


I was not talking about thread priority, but ordering priority (which
task gets chosen first). I don't really care about thread priority in my
application, and I understand that per-task thread priority doesn't make
much sense. If I needed per-task thread priority I'd simply make pools
for the various thread priorities and put tasks in the right pools.

That said, perhaps I could do exactly that: create two or three pools
with different thread priorities, put tasks into the right pool and let
the OS sort out the scheduling. But then the question becomes: how do I
choose the thread priority of a task pool? It doesn't seem possible from
the documentation. Perhaps TaskPool's constructor should have a
parameter for that.



This sounds like a good solution.  The general trend I've seen is that 
the ability to create >1 pools elegantly solves a lot of problems that 
would be a PITA from both an interface and an implementation perspective 
to solve more directly.  I've added a priority property to TaskPool that 
allows setting the OS priority of the threads in the pool.  This just 
forwards to core.thread.priority(), so usage is identical.
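A sketch of the two-pool approach, assuming the priority property described in this message plus the Thread priority constants from core.thread:

```d
import std.parallelism;
import core.thread;

void main() {
    // a dedicated pool for background work, at minimum OS priority
    auto background = new TaskPool(2);
    background.priority = Thread.PRIORITY_MIN;
    scope(exit) background.finish();

    // user-requested tasks go to the default, normal-priority pool
    // via taskPool.put(...); background tasks go to `background`
}
```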



- - -

Another remarks: in the documentation for the TaskPool constructor, it
says:

""Default constructor that initializes a TaskPool with one worker thread
for each CPU reported available by the OS, minus 1 because the thread
that initialized the pool will also do work.""

This "minus 1" thing doesn't really work for me. It certainly makes sense
for a parallel foreach use case -- whenever the current thread would
block until the work is done you can use that thread to work too -- but
in my use case I delegate all the work to other threads because my main
thread isn't a dedicated working thread and it must not block. It'd be
nice to have a boolean parameter for the constructor to choose if the
main thread will work or not (and whether it should do minus 1 or not).

For the global taskPool, I guess I would just have to write
"defaultPoolThreads = defaultPoolThreads+1" at the start of the program
if the main thread isn't going to be working.




I've solved this, though in a slightly different way.  Based on 
discussions on this newsgroup I had recently added an osReportedNcpu 
variable to std.parallelism instead of using core.cpuid.  This is an 
immutable global variable that is set in a static this() statement.


Since we don't know what the API for querying stuff like this should be, 
I had made it private.  I changed it to public.  I realized that, even 
if a more full-fledged API is added at some point for this stuff, there 
should be an obvious, convenient way to get it directly from 
std.parallelism anyhow, and it would be trivial to call whatever API 
eventually evolves to set this value.  Now, if you don't like the -1 
thing, you can just do:


auto pool = new TaskPool(osReportedNcpu);

or

defaultPoolThreads = osReportedNcpu;


Quo vadis, D2? Thoughts on the D library ecosystem.

2011-03-19 Thread David Nadlinger
While lying in bed with a fever yesterday (so please excuse any 
careless mistakes), I was pondering a bit about the current discussions 
regarding Phobos additions, package management, etc. It occurred to me 
that there is a central unanswered question, which I think deserves to 
be broadly discussed right now.


But first, let me start out by describing how I see current situation 
regarding D2. Leaving aside a few minor things like @property 
enforcement or the recent suggestions about a new alias syntax, the 
language is fairly stable and critical bugs in DMD 2 are not frequent 
enough to make it completely unusable for day-to-day development 
anymore. Of course, there is still a large way to go for the D toolchain 
(with the ideal result being a rock-solid self-hosting compiler 
front-end, usable as a library as well), but in a sense, we are more or 
less at the end of a certain stage of D2 development.


I think most of you would agree with me if I say that the main goal for 
D2 right now should be to build a vibrant library ecosystem around the 
language, to foster adoption in real-world applications. There has been 
a number of related discussions recently, but as mentioned above, I 
think there is a central question:


Have we reached the critical mass yet where it makes sense to split the 
effort into a number of smaller library projects, or are we better off 
concentrating on a central, comprehensive standard library (Phobos), 
considering the current community size?


I do not really have an answer to this question, but here are a few 
thoughts on the topic, which might also help to make clearer what I mean:


I think that adopting a Boost-like review process for Phobos has 
certainly been a clever and valuable move, for more than one reason. 
First, together with the move to Git, it has helped to reinforce the 
point that D2 and Phobos are open to contributions from everyone, given 
that they meet certain quality standards. Second, it certainly boosts 
code quality of further standard library additions, which had been a 
problem for some parts in the past (at least from my point of view, no 
offense intended). Third, and this overlaps with another point below, I 
think that the quality improvements will also help to reduce bit rot, 
which has traditionally been a problem with D libraries.


But however good a fit this model is for the standard library, I think 
it is no silver bullet either. There are small, one-off style projects, 
arising from a central need, where the amount of time needed to get the 
code through the whole review process is prohibitive – even if the code 
quality was high enough –, but the result is still usable for the wide 
public. Common examples for this would be low-level wrappers for C 
libraries, although they don't really qualify for inclusion into Phobos 
for other reasons (often, another wrapper layer is needed to be usable 
with common D idioms). Also, people new to the language might be scared 
away by the mere thought of contributing to a standard library. How to 
make sure that these libraries are not forgotten? Maybe a central 
package system with SCM (Git, …) integration can help here?


And, which brings me to the next point, how to fight the unfavorable 
outcome of having a huge inscrutable pile of half-finished bit-rotten 
code, a problem that DSource is currently experiencing? A central, 
well-maintained standard library effort with a wider scope could 
certainly help to reduce this problem, at least from the (D) user side, 
but on the other hand, larger amounts of code de facto becoming 
unmaintained would be a problem for it as well.


Should we build something like a staging area, an incubator for 
community contributions not taken yet through formal review, but of 
interest for a wider audience? What about the etc.* package – would it 
be an option to expand it into such an incubation area? If not, what 
should it evolve into – a collection of C-level library bindings (see 
the recent discussion on SQLite bindings started by David Simcha)? Who 
will take care of the maintenance duties?


Looking forward to a stimulating discussion,
David


Re: Dream package management system (Was: a cabal for D ?)

2011-03-19 Thread Jacob Carlborg

On 2011-03-18 18:04, Chris Manning wrote:

On 17/03/2011 22:49, Jason E. Aten wrote:

Somewhat tongue in cheek, we could call it dabal.

As in, "get on dabal!" :-)


If D gets accepted for Google Summer of Code, I think this would be a
great idea for a project and I would be interested in implementing it as
a student. Although, it does seem overly ambitious so maybe only some of
this could be for the gsoc (and if I do this It'd be great to carry on
working on it anyway).

What does everybody think about this? Should I draw up a proposal of
some kind?

Chris


I've been thinking for quite some time about building a package management 
system for D; let's call it dpac as an example. These are the ideas that I 
have:


Basically copy how RubyGems works.
Use Ruby as a DSL for dpacspec files, which are used to create the dpac 
file. This is an example of how a file used to build a package 
could look:


name "Foo Bar"
summary "This is the Foo Bar package"
version "1.0.0"
type :lib
author "Jacob Carlborg"
files ["lib.d"] # list of the files in the package
build :make # other options could be :dsss :cmake and so on
dversion 2 # D1 or D2

Build a dpac package out of the dpacspec file:

dpac foobar.dpacspec

Publish the package:

$ dpac publish foobar

Install the package:

$ dpac install foobar

A dpac package would just be a zip file (or some other type of archive) 
containing all the necessary files to build the package and a file with 
meta data.


All packages would be managed on a basic RESTful web server, using GET to 
download a package and POST to publish a package.


I'm working on a build system for D that I was thinking about to 
integrate with the package management system. Then the build system 
could track the files needed to build the package, making the "files" 
attribute optional.


I also have a tool called DVM, https://bitbucket.org/doob/dvm , used for 
installing and managing different versions of D compilers. I was 
thinking about integrating DVM with the package management system to be 
able to install different packages for different compilers.


--
/Jacob Carlborg


Re: Has the ban on returning function nested structs been lifted?

2011-03-19 Thread spir

On 03/19/2011 01:40 PM, Simen kjaeraas wrote:

On Sat, 19 Mar 2011 13:05:59 +0100, spir  wrote:


I guess something similar should be the base design of ranges. "Range of X"
could simply mean "lazy sequence of X", an on-demand array (lol); and that
would be the return type of every function returning a range. The complexity
(of filter-ing, map-ping, find-ing) could be hidden inside the object, not
exposed in the outer type.


Such a scheme precludes the usage of structs as ranges, though. It would
require virtual functions.


Oh, yes, seems you're right. Too bad.

Denis
--
_
vita es estrany
spir.wikidot.com



Re: internal representation of struct

2011-03-19 Thread Trass3r

On 18.03.2011 at 18:33, Daniel Gibson wrote:

Why not do something like:
struct vec3f {
   float[3] xyz;
   @property float x() { return xyz[0]; } // read x
   @property float x(float val) { return xyz[0] = val; } // write x
   // and the same for y and z ...
   // ... and your own functions
}


You could also use unions to achieve that.
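A sketch of that union approach: an anonymous union with an anonymous struct lets both views share the same storage, with no property boilerplate:

```d
struct vec3f {
    union {
        float[3] xyz;                 // array view
        struct { float x, y, z; }     // named-component view
    }
}

void main() {
    vec3f v;
    v.xyz = [1.0f, 2.0f, 3.0f];
    // both views alias the same memory
    assert(v.x == 1.0f && v.y == 2.0f && v.z == 3.0f);
}
```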


Re: std.parallelism: Final review

2011-03-19 Thread Michel Fortin

On 2011-03-18 22:27:14 -0400, dsimcha  said:

I think your use case is both beyond the scope of std.parallelism and 
better handled by std.concurrency.  std.parallelism is mostly meant to 
handle the pure multicore parallelism use case.  It's not that it 
**can't** handle other use cases, but that's not what it's tuned for.


I know. But if this finds its way into the standard library, perhaps it 
should aim at reaching a slightly wider audience? Especially since it 
lacks so little to become more general purpose...



As far as prioritization, it wouldn't be hard to implement 
prioritization of when a task starts (i.e. have a high- and 
low-priority queue).  However, the whole point of TaskPool is to avoid 
starting a new thread for each task.  Threads are recycled for 
efficiency.  This prevents changing the priority of things in the OS 
scheduler.  I also don't see how to generalize prioritization to map, 
reduce, parallel foreach, etc. w/o making the API much more complex.


I was not talking about thread priority, but ordering priority (which 
task gets chosen first). I don't really care about thread priority in 
my application, and I understand that per-task thread priority doesn't 
make much sense. If I needed per-task thread priority I'd simply make 
pools for the various thread priorities and put tasks in the right 
pools.


That said, perhaps I could do exactly that: create two or three pools 
with different thread priorities, put tasks into the right pool and let 
the OS sort out the scheduling. But then the question becomes: how do I 
choose the thread priority of a task pool? It doesn't seem possible from 
the documentation. Perhaps TaskPool's constructor should have a 
parameter for that.



In addition, std.parallelism guarantees that tasks will be started in 
the order that they're submitted, except that if the results are needed 
immediately and the task hasn't been started yet, it will be pulled out 
of the middle of the queue and executed immediately.  One way to get 
the prioritization you need is to just submit the tasks in order of 
priority, assuming you're submitting them all from the same place.


Most of my tasks are background tasks that just need to be done 
eventually while others are user-requested tasks which can be requested 
at any time in the main thread. Issuing them serially is not really an 
option.



One last thing:  As far as I/O goes, AsyncBuf may be useful.  This 
allows you to pipeline reading of a file and higher level processing. 
Example:


// Read the lines of a file into memory in parallel with processing
// them.
import std.stdio, std.parallelism, std.algorithm, std.array, std.conv;

void main() {
 auto lines = map!"a.idup"(File("foo.txt").byLine());
 auto pipelined = taskPool.asyncBuf(lines);

 foreach(line; pipelined) {
 auto ls = line.split("\t");
 auto nums = to!(double[])(ls);
 }
}


Looks nice, but doesn't really work for what I'm doing. Currently I 
have one task per file, each task reading a relatively small file and 
then parsing its content.


- - -

Another remarks: in the documentation for the TaskPool constructor, it says:

""Default constructor that initializes a TaskPool with one worker 
thread for each CPU reported available by the OS, minus 1 because the 
thread that initialized the pool will also do work.""


This "minus 1" thing doesn't really work for me. It certainly makes 
sense for a parallel foreach use case -- whenever the current thread 
would block until the work is done you can use that thread to work too 
-- but in my use case I delegate all the work to other threads because 
my main thread isn't a dedicated working thread and it must not block. 
It'd be nice to have a boolean parameter for the constructor to choose 
if the main thread will work or not (and whether it should do minus 1 
or not).


For the global taskPool, I guess I would just have to write 
"defaultPoolThreads = defaultPoolThreads+1" at the start of the program 
if the main thread isn't going to be working.



--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



Re: review of std.parallelism

2011-03-19 Thread Simen kjaeraas

On Sat, 19 Mar 2011 13:51:56 +0100, dsimcha  wrote:


== Quote from Simen kjaeraas (simen.kja...@gmail.com)'s article

On Sat, 19 Mar 2011 05:40:08 +0100, dsimcha  wrote:
> On 3/18/2011 11:29 PM, Andrei Alexandrescu wrote:
>> 1. Library proper:
>>
>> * "In the case of non-random access ranges, parallel foreach is still
>> usable but buffers lazily to an array..." Wouldn't strided processing
>> help? If e.g. 4 threads the first works on 0, 4, 8, ... second works  
on

>> 1, 5, 9, ... and so on.
>
> You can have this if you want, by setting the work unit size to 1.
> Setting it to a larger size just causes more elements to be buffered,
> which may be more efficient in some cases.
Please add an example showing that, too. Sure, the documentation says
that's what's being done, but an example would show it more clearly.


I don't understand how this can be demonstrated in an example.  It's an
under-the-hood thing.  The only place this appears in the API is in the
workUnitSize parameter.


Yeah, scratch that. I for some reason thought the "array of size
workUnitSize" was global, but it's per thread, innit? Seems so logical now.


--
Simen


Re: review of std.parallelism

2011-03-19 Thread dsimcha
== Quote from Simen kjaeraas (simen.kja...@gmail.com)'s article
> On Sat, 19 Mar 2011 05:40:08 +0100, dsimcha  wrote:
> > On 3/18/2011 11:29 PM, Andrei Alexandrescu wrote:
> >> 1. Library proper:
> >>
> >> * "In the case of non-random access ranges, parallel foreach is still
> >> usable but buffers lazily to an array..." Wouldn't strided processing
> >> help? If e.g. 4 threads the first works on 0, 4, 8, ... second works on
> >> 1, 5, 9, ... and so on.
> >
> > You can have this if you want, by setting the work unit size to 1.
> > Setting it to a larger size just causes more elements to be buffered,
> > which may be more efficient in some cases.
> Please add an example showing that, too. Sure, the documentation says
> that's what's being done, but an example would show it more clearly.

I don't understand how this can be demonstrated in an example.  It's an
under-the-hood thing.  The only place this appears in the API is in the
workUnitSize parameter.


Re: Has the ban on returning function nested structs been lifted?

2011-03-19 Thread Simen kjaeraas

On Sat, 19 Mar 2011 13:05:59 +0100, spir  wrote:

I guess something similar should be the base design of ranges. "Range of  
X" could simply mean "lazy sequence of X", an on-demand array (lol); and  
that would be the return type of every function returning a range. The  
complexity (of filter-ing, map-ping, find-ing) could be hidden inside  
the object, not exposed in the outer type.


Such a scheme precludes the usage of structs as ranges, though. It would
require virtual functions.


--
Simen


Re: a cabal for D ?

2011-03-19 Thread Lutger Blijdestijn
Russel Winder wrote:

> On Thu, 2011-03-17 at 20:44 +, Jason E. Aten wrote:
>> Please correct me if I'm wrong, but I observe that there doesn't appear
>> to be a package management system / standard repository for D libraries.
>> Or is there?
>> 
>> I'm talking about something as easy to use as R's CRAN,
>> > install.packages("rforest")
>> 
>> or cpan for perl, ctan for latex, dpkg/apt for debian, cabal for Haskell/
>> Hackage, etc.
> 
> Note that every language-specific package manager conflicts directly
> with every operating system package manager.  Thus RubyGems, CPAN,
> Cabal, Maven, Go, etc. conflicts with the package management of Debian,
> Fedora, SUSE, FreeBSD, MacPorts, etc. leading to pain.  Pain leads to
> anger.  Anger leads to hate.  Hate leads to suffering.
> 
>> If there's not a commonly utilized one currently, perhaps we could
>> "borrow" cabal, with a trivial port.  cabal is Haskell's package manager.
>> 
>> Not only does having a standard package install system facilitate
>> adoption, it greatly facilitates code sharing and library maturation.
> 
> At the expense of easy system administration.

Not necessarily; Fedora has RPM packages of gems, for example. 
 
> I guess the only up side of language specific package management is that
> it enables people whose operating systems are not package structured to
> do things sensibly.  Alternatively Windows users could switch to a
> sensible operating system ;-)
> 

It's also often easier to package libraries with a system specifically 
designed to do so for a particular language. That, combined with a common 
repository, usually results in a much wider selection of APIs than a native 
distribution offers.


Re: Has the ban on returning function nested structs been lifted?

2011-03-19 Thread spir

On 03/19/2011 10:27 AM, Simen kjaeraas wrote:

On Fri, 18 Mar 2011 23:48:53 +0100, bearophile  wrote:


Jonathan M Davis:


Actually, the coolest part about it IMHO is that it highlights the fact that
you
should be using auto with std.algorithm and _not_ care about the exact types of
the return types. Knowing the exact return type for those functions is
generally
unnecessary and is often scary anyway (especially with the functions which
return lazy ranges like map and until). Making the functions return auto and
completely hiding the return type pretty much forces the issue. There's still
likely to be some confusion for those new to D, but it makes the proper way to
use std.algorithm more obvious. I'd hate to deal with any code which used
std.algorithm without auto. That would get ugly _fast_.


auto variable inference is indeed almost necessary if you want to use lazy
functions as the ones in Phobos. But I have to say that those types are scary
because of the current design of those Phobos higher order functions. In
Haskell if you have an iterable and you perform a map on it using a function
that returns an int, you produce something like a [Int], that's a lazy list
of machine integers. This is a very simple type. If you perform another map
on that list, and the mapping function returns an int again, the type of the
whole result is [Int] still. The type you work with doesn't grow more and
more as with Phobos functions. Designers of C# LINQ have found a more complex
solution, they build a tree of lazy delegates...


And we can have something similar in D:

struct Range( T ) {
void delegate( ) popFrontDg;
bool delegate( ) emptyDg;
T delegate( ) frontDg;

this( R )( R range ) if ( isForwardRange!R && is( ElementType!R == T ) ) {
auto rng = range.save();
popFrontDg = ( ){ rng.popFront(); };
emptyDg = ( ){ return rng.empty; };
frontDg = ( ){ return rng.front; };
}

@property T front( ) {
return frontDg( );
}

@property bool empty( ) {
return emptyDg( );
}

void popFront( ) {
popFrontDg( );
}
}

Range!(ElementType!R) range( R )( R rng ) if ( isForwardRange!R ) {
return Range!(ElementType!R)( rng );
}


There are times when I've wanted something like this because I don't
know the resultant type of a bunch of range operations, but have to
save it in a struct or class.


I guess something similar should be the base design of ranges. "Range of X" 
could simply mean "lazy sequence of X", an on-demand array (lol); and that 
would be the return type of every function returning a range. The complexity 
(of filter-ing, map-ping, find-ing) could be hidden inside the object, not 
exposed in the outer type.


Denis
--
_
vita es estrany
spir.wikidot.com



Re: a cabal for D ?

2011-03-19 Thread Jacob Carlborg

On 2011-03-18 22:20, Jason E. Aten wrote:

On Fri, 18 Mar 2011 18:42:36 +, Russel Winder wrote:

I still think basing a D packaging system on Git is the best
direction.


Basing package distribution on Git or hg could be a big win, and would
help establish a customary case for revision control which is one of the
things that make cabal work so well (they use darcs for everything). I
find these revision control systems very fast and very easy to use.

The other thing that cabal standardizes is the make/build system.  I've
updated bud/build to compile under D2, with all the latest patches, but
I'm far from convinced that it should be a make system of choice.  I have
limited experience here, but a "D aware" build system would seem to be
highly preferable.

What are people's experiences with the various options for build systems
with D?


It's not very easy to make an incremental build system for D because of 
several reasons. Some are due to how the language works and some are due 
to how DMD works:


* DMD doesn't output all data in all the object files - This can perhaps 
be solved by compiling with the -lib switch


* When you change one D file you need to recompile ALL files that depend 
on the changed file. To compare with C/C++ which has source and header 
files you only need to recompile the source file if you change it


* DMD doesn't keep the fully qualified module name when naming object 
files, so foo.bar will conflict with bar.bar (Issue 3541).



I like the design goals of Andreas Fredriksson's Tundra build
system (he wants the speed of incremental builds prioritized over all
else, which means utilizing multicores for builds as much as possible to
get the quickest build), because fast builds are critical for game
development, where D is very attractive.  Game projects compile tens of
thousands of files. Tundra is GPL and it would be easy to extend to
support D.

http://voodoo-slide.blogspot.com/2010/08/tundra-my-build-system.html
https://github.com/deplinenoise/tundra
https://github.com/deplinenoise/tundra/downloads



--
/Jacob Carlborg


Re: a cabal for D ?

2011-03-19 Thread Jacob Carlborg

On 2011-03-18 09:52, Russel Winder wrote:

On Thu, 2011-03-17 at 20:44 +, Jason E. Aten wrote:

Please correct me if I'm wrong, but I observe that there doesn't appear
to be a package management system / standard repository for D libraries.
Or is there?

I'm talking about something as easy to use as R's CRAN,

install.packages("rforest")


or cpan for perl, ctan for latex, dpkg/apt for debian, cabal for Haskell/
Hackage, etc.


Note that every language-specific package manager conflicts directly
with every operating system package manager.  Thus RubyGems, CPAN,
Cabal, Maven, Go, etc. conflicts with the package management of Debian,
Fedora, SUSE, FreeBSD, MacPorts, etc. leading to pain.  Pain leads to
anger.  Anger leads to hate.  Hate leads to suffering.


If there's not a commonly utilized one currently, perhaps we could
"borrow" cabal, with a trivial port.  cabal is Haskell's package manager.

Not only does having a standard package install system facilitate
adoption, it greatly facilitates code sharing and library maturation.


At the expense of easy system administration.

I guess the only up side of language specific package management is that
it enables people whose operating systems are not package structured to
do things sensibly.  Alternatively Windows users could switch to a
sensible operating system ;-)


Another advantage that at least RubyGems has, in combination with 
RVM, is that you can have different gems/packages installed for 
different Ruby compilers.



Given that D has chosen to switch to Git for version control, doesn't
this imply that package management transported over DVCS is the way
forward.  Go has certainly taken this route.  It prioritizes Mercurial
but supports Bazaar and Git as well.





--
/Jacob Carlborg


Re: Dream package management system (Was: a cabal for D ?)

2011-03-19 Thread Jacob Carlborg

On 2011-03-17 23:44, Jason E. Aten wrote:

On 3/17/11 4:00 PM, Tomek Sowiński wrote:
Yes, we need it badly.
I think it's a good moment to start a discussion. First off, what
exactly do we want from a package management system?



On Thu, 17 Mar 2011 16:28:37 -0500, Andrei Alexandrescu wrote:
Yah, would be great. It would be awesome if an expert in e.g. apt would
join D and create the design of a package management system.


I would invite interested parties to review the cabal/cabal-install/
Hackage system documentation. It is described here.

http://www.haskell.org/haskellwiki/How_to_write_a_Haskell_program

Haskell has over 2000 contributed user libraries--largely because they
have such a nice, easy to use, and well documented package system.

Rather than expend much effort, in the tradition of lazy evaluation and
getting 80/20 of the way there, I would prefer to just clone an
already successful system such as cabal (or ?) and then take feedback
based on actual usage with D.

Thoughts?  Comments welcome.

Jason


I would clone RubyGems.

--
/Jacob Carlborg


Re: a cabal for D ?

2011-03-19 Thread Jacob Carlborg

On 2011-03-17 22:47, Lutger Blijdestijn wrote:

Jason E. Aten wrote:


Please correct me if I'm wrong, but I observe that there doesn't appear
to be a package management system / standard repository for D libraries.
Or is there?

I'm talking about something as easy to use as R's CRAN,

install.packages("rforest")


or cpan for perl, ctan for latex, dpkg/apt for debian, cabal for Haskell/
Hackage, etc.

If there's not a commonly utilized one currently, perhaps we could
"borrow" cabal, with a trivial port.  cabal is Haskell's package manager.

Not only does having a standard package install system facilitate
adoption, it greatly facilitates code sharing and library maturation.


There used to be one called dsss (d shared software system). It was widely
used; I think some D1 libraries still use it, but it hasn't been maintained
for years.


It's still working fine for building and installing local libraries with 
D1; it's the net capabilities that don't work.


--
/Jacob Carlborg


Re: Different types with auto

2011-03-19 Thread Bekenn

On 3/18/2011 12:59 PM, bearophile wrote:

http://d.puremagic.com/issues/show_bug.cgi?id=2656


Thank goodness that's under discussion.


Re: Why can't structs be derived from?

2011-03-19 Thread Bekenn

On 3/18/2011 7:09 AM, Nick Sabalausky wrote:

"typedef b a;" (or "typedef a = b;")


Regarding syntax, maybe:

typedef A : int;
typedef B : int;

...with semantics as follows:
A a = 5;                // ok
B b = a;                // error
int i = a;              // ok
a = i;                  // error
a = cast(A)i;           // ok
b = cast(B)a;           // error
b = cast(B)cast(int)i;  // ok

Possibly instead of 'typedef' we should be using a non-C keyword.  Heck, 
even 'type' works:


type A : int;

...and is more consistent with existing type declarations (we use 
'class', 'struct', and 'enum', not 'classdef', 'structdef', and 'enumdef').


Not sure if typedef should work with aggregates; that might just get too 
confusing.


Bleh.  Now I'm /really/ off-topic...


Re: Has the ban on returning function nested structs been lifted?

2011-03-19 Thread Simen kjaeraas
On Sat, 19 Mar 2011 11:18:17 +0100, Jonathan M Davis   
wrote:



There are times when I've wanted something like this because I don't
know the resultant type of a bunch of range operations, but have to
save it in a struct or class.


typeof is your friend.


Only when there is a definite type. Consider:

struct foo {
    Range!int rng;

    this( int[] arr, bool b ) {
        if ( b ) {
            rng = range( arr );
        } else {
            rng = range( map!"a+1"( arr ) );
        }
    }
}


--
Simen


Re: Has the ban on returning function nested structs been lifted?

2011-03-19 Thread Jonathan M Davis
On Saturday 19 March 2011 02:27:46 Simen kjaeraas wrote:
> On Fri, 18 Mar 2011 23:48:53 +0100, bearophile 
> 
> wrote:
> > Jonathan M Davis:
> >> Actually, the coolest part about it IMHO is that it highlights the fact
> >> that you
> >> should be using auto with std.algorithm and _not_ care about the exact
> >> types of
> >> the return types. Knowing the exact return type for those functions is
> >> generally
> >> unnecessary and is often scary anyway (especially with the functions
> >> which
> >> return lazy ranges like map and until). Making the functions return
> >> auto and
> >> completely hiding the return type pretty much forces the issue. There's
> >> still
> >> likely to be some confusion for those new to D, but it makes the proper
> >> way to
> >> use std.algorithm more obvious. I'd hate to deal with any code which
> >> used
> >> std.algorithm without auto. That would get ugly _fast_.
> > 
> > auto variable inference is indeed almost necessary if you want to use
> > lazy functions as the ones in Phobos. But I have to say that those types
> > are scary because of the current design of those Phobos higher order
> > functions. In Haskell if you have an iterable and you perform a map on
> > it using a function that returns an int, you produce something like a
> > [Int], that's a lazy list of machine integers. This is a very simple
> > type. If you perform another map on that list, and the mapping function
> > returns an int again, the type of the whole result is [Int] still. The
> > type you work with doesn't grow more and more as with Phobos functions.
> > Designers of C# LINQ have found a more complex solution, they build a
> > tree of lazy delegates...
> 
> And we can have something similar in D:
> 
> struct Range( T ) {
>  void delegate( ) popFrontDg;
>  bool delegate( ) emptyDg;
>  T delegate( ) frontDg;
> 
>  this( R )( R range ) if ( isForwardRange!R && is( ElementType!R == T )
> ) {
>  auto rng = range.save();
>  popFrontDg = ( ){ rng.popFront(); };
>  emptyDg= ( ){ return rng.empty; };
>  frontDg= ( ){ return rng.front; };
>  }
> 
>  @property T front( ) {
>  return frontDg( );
>  }
> 
>  @property bool empty( ) {
>  return emptyDg( );
>  }
> 
>  void popFront( ) {
>  popFrontDg( );
>  }
> }
> 
> Range!(ElementType!R) range( R )( R rng ) if ( isForwardRange!R ) {
>  return Range!(ElementType!R)( rng );
> }
> 
> 
> There are times when I've wanted something like this because I don't
> know the resultant type of a bunch of range operations, but have to
> save it in a struct or class.

typeof is your friend.
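For cases where the result type is fixed at compile time, a minimal sketch (my addition, using std.algorithm's map; the names Holder and rng are invented for illustration) of the typeof approach:

```d
import std.algorithm : map;

struct Holder
{
    // typeof names the otherwise-anonymous return type of map,
    // so it can be stored in an aggregate.
    typeof(map!"a + 1"((int[]).init)) rng;
}

void main()
{
    Holder h;
    h.rng = map!"a + 1"([1, 2, 3]);
    assert(h.rng.front == 2); // first mapped element: 1 + 1
}
```

Of course, this only helps when one static type covers every code path; it does not cover the runtime-branch case discussed elsewhere in this thread.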

- Jonathan M Davis


Re: Why can't structs be derived from?

2011-03-19 Thread Don

Simen kjaeraas wrote:

On Fri, 18 Mar 2011 15:09:23 +0100, Nick Sabalausky  wrote:


"Bekenn"  wrote in message
news:ilv2pd$1vkd$1...@digitalmars.com...

On 3/17/2011 2:36 PM, Andrei Alexandrescu wrote:


I'm with y'all too. Even Walter needs to stop and think for a second.
We're considering enabling

alias a = b;

as an equivalent for

alias b a;



Along similar lines (hoping this isn't too far off-topic), what's the
current plan for typedef?  I'm aware that it's deprecated (and for good
reason), but some of my reading suggests that there's a successor on the
horizon.


I was thinking of asking about that, too. Specifically, would it make 
sense for "typedef b a;" (or "typedef a = b;") to be lowered to something like:

struct a
{
b _tmp;
alias _tmp this;
}

Hmm, then again, IIUC, that would allow 'a' to be implicitly converted 
to 'b', which would defeat half the point, so maybe not.


Yeah. Typedef is too blunt an instrument for our purposes. What we want is:

alias Subtype!int SubInt;
alias Supertype!int SupInt;
alias Standalone!int NaturalNumber;

Where the following work:

int a = SubInt(3);
SupInt b = 3;
NaturalNumber c = NaturalNumber(3);

and the following do not:

SubInt d = 3;
int e = SupInt(3);
NaturalNumber f = 3;
int g = NaturalNumber(3);

And of course:

alias Subtype!int SubInt2;
alias Supertype!int SupInt2;
alias Standalone!int NaturalNumber2;

Where these do not work:

SubInt2 h = SubInt(3);
SupInt2 i = SupInt(3);
NaturalNumber2 j = NaturalNumber(3);


I think the classic use case for typedef is Windows handles.
HMENU menu;
HWND window;
HANDLE h = menu; // OK
h = window; // OK
menu = window; // should not compile.

My feeling is that the built-in typedef is just not precise enough to 
be much use.
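A hedged sketch (my own, not an existing Phobos artifact) of how the Standalone flavour above could be approximated with a plain wrapper struct — the names Standalone and get are invented for illustration:

```d
// Hypothetical strong-typedef wrapper; 'Standalone' and 'get' are
// invented names, not part of Phobos.
struct Standalone( T )
{
    private T payload;

    this( T value ) { payload = value; }

    // Extraction is explicit; there is no 'alias this', so the
    // wrapper never converts implicitly back to T.
    T get( ) { return payload; }
}

void main( )
{
    alias Standalone!int NaturalNumber;

    NaturalNumber c = NaturalNumber( 3 ); // ok: explicit construction
    int g = c.get( );                     // ok: explicit extraction
    assert( g == 3 );
    // int h = c;                         // error: no implicit conversion
}
```

A Subtype flavour would add `alias this payload;` to regain the one-way implicit conversion to T; making the other direction implicit (Supertype) is harder to express in library code.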







Re: Has the ban on returning function nested structs been lifted?

2011-03-19 Thread Simen kjaeraas
On Fri, 18 Mar 2011 23:48:53 +0100, bearophile   
wrote:



Jonathan M Davis:

Actually, the coolest part about it IMHO is that it highlights the fact  
that you
should be using auto with std.algorithm and _not_ care about the exact  
types of
the return types. Knowing the exact return type for those functions is  
generally
unnecessary and is often scary anyway (especially with the functions  
which
return lazy ranges like map and until). Making the functions return  
auto and
completely hiding the return type pretty much forces the issue. There's  
still
likely to be some confusion for those new to D, but it makes the proper  
way to
use std.algorithm more obvious. I'd hate to deal with any code which  
used

std.algorithm without auto. That would get ugly _fast_.


auto variable inference is indeed almost necessary if you want to use  
lazy functions as the ones in Phobos. But I have to say that those types  
are scary because of the current design of those Phobos higher order  
functions. In Haskell if you have an iterable and you perform a map on  
it using a function that returns an int, you produce something like a  
[Int], that's a lazy list of machine integers. This is a very simple  
type. If you perform another map on that list, and the mapping function  
returns an int again, the type of the whole result is [Int] still. The  
type you work with doesn't grow more and more as with Phobos functions.  
Designers of C# LINQ have found a more complex solution, they build a  
tree of lazy delegates...


And we can have something similar in D:

import std.range : ElementType, isForwardRange;

struct Range( T ) {
    void delegate( ) popFrontDg;
    bool delegate( ) emptyDg;
    T delegate( ) frontDg;

    this( R )( R range ) if ( isForwardRange!R && is( ElementType!R == T ) ) {
        auto rng = range.save();
        popFrontDg = ( ){ rng.popFront(); };
        emptyDg    = ( ){ return rng.empty; };
        frontDg    = ( ){ return rng.front; };
    }

    @property T front( ) {
        return frontDg( );
    }

    @property bool empty( ) {
        return emptyDg( );
    }

    void popFront( ) {
        popFrontDg( );
    }
}

Range!(ElementType!R) range( R )( R rng ) if ( isForwardRange!R ) {
    return Range!(ElementType!R)( rng );
}


There are times when I've wanted something like this because I don't
know the resultant type of a bunch of range operations, but have to
save it in a struct or class.

--
Simen


Re: review of std.parallelism

2011-03-19 Thread Simen kjaeraas

On Sat, 19 Mar 2011 05:40:08 +0100, dsimcha  wrote:


On 3/18/2011 11:29 PM, Andrei Alexandrescu wrote:

1. Library proper:

* "In the case of non-random access ranges, parallel foreach is still
usable but buffers lazily to an array..." Wouldn't strided processing
help? If e.g. 4 threads the first works on 0, 4, 8, ... second works on
1, 5, 9, ... and so on.


You can have this if you want, by setting the work unit size to 1.  
Setting it to a larger size just causes more elements to be buffered,  
which may be more efficient in some cases.


Please add an example showing that, too. Sure, the documentation says
that's what's being done, but an example would show it more clearly.
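For instance, a minimal sketch (my addition, based on the module's documented parallel foreach API) of the work-unit-size-1 case:

```d
import std.parallelism;
import std.math : sqrt;

void main()
{
    auto nums = new double[](1_000);

    // A work unit size of 1 makes each worker claim one element at a
    // time, which approximates strided processing across the threads.
    foreach (i, ref x; taskPool.parallel(nums, 1))
    {
        x = sqrt(cast(double) i);
    }

    assert(nums[81] == 9.0);
}
```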


--
Simen


Re: review of std.parallelism

2011-03-19 Thread dsimcha
Ok, thanks again for clarifying **how** the docs could be improved. 
I've implemented the suggestions and generally given the docs a good 
reading over and clean up.  The new docs are at:


http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html

On 3/18/2011 11:29 PM, Andrei Alexandrescu wrote:

0. Overview and vote

I think the library delivers the high-level goods (parallel foreach,
map, reduce) but is a bit fuzzy in the lower level details.

Documentation hurts understanding of the capabilities of the library and
essentially is of inadequate quality. Entity documentation and examples
do little more than give the impression of dispensing with an unpleasant
chore.

My vote in favor of acceptance is contingent upon a radical improvement
in the quality of documentation and examples. Most if not all artifacts
should be motivated by simple, compelling examples. The introductory
section must contain a brief and attractive synopsis of the flagship
capabilities. All terms must be defined before being used and introduced
in a carefully-chosen order. The relationship between various entities
should be clarified.

I've seen the argument that simple and strong examples are difficult to
find. Though I agree such examples are not easy to come up with, I also
believe the author of library is in the best position to produce them.



1. Library proper:

* "In the case of non-random access ranges, parallel foreach is still
usable but buffers lazily to an array..." Wouldn't strided processing
help? If e.g. 4 threads the first works on 0, 4, 8, ... second works on
1, 5, 9, ... and so on.

* Example with squares would be better if double replaced uint, and if a
more complicated operation (sqrt, log...) were involved.

* I'm unclear on the tactics used by lazyMap. I'm thinking the obvious
method should be better: just use one circular buffer. The presence of
two dependent parameters makes this abstraction difficult to operate with.

* Same question about asyncBuf. What is wrong with a circular buffer
filled on one side by threads and consumed from the other by the
client? I can think of a couple of answers but it would be great if they
were part of the documentation.

* Why not make workerIndex a ulong and be done with it?

* Is stop() really trusted or just unsafe? If it's forcibly killing
threads then it's unsafe.

* uint size() should be size_t for conformity.

* defaultPoolThreads - should it be a @property?

* I don't understand what Task is supposed to do. It is circularly
defined: "A struct that encapsulates the information about a task,
including its current status, what pool it was submitted to, and its
arguments." OK, but what's a task? Could a task be used outside a task
pool, and if so to what effect(s)?

* If LazyMap is only necessary as the result of lazyMap, could that
become a local type inside lazyMap?


2. Documentation:

* Documentation unacceptable. It lacks precision and uses terms before
defining them. For example: "This class encapsulates a task queue and a
set of worker threads" comes before the notions of "task queue" and
"worker thread" are defined. Clearly there is an intuition of what those
are supposed to mean, but in practice each library lays down some fairly
detailed definition of such.

* "Note: Initializing a pool with zero threads (as would happen in the
case of a single-core CPU) is well-tested and does work." The absence of
a bug should not be advertised in so many words. Simpler: "Note:
Single-CPU machines will operate transparently with zero-sized pools."

* "Allows for custom pool size." I have no idea what pool size means.

* "// Do something interesting." Worst example ever. You are the creator
of the library. If _you_ can't find a brief compelling example, who could?

* "Immediately after the range argument, an optional block size argument
may be provided. If none is provided, the default block size is used. An
optional buffer for returning the results may be provided as the last
argument. If one is not provided, one will be automatically allocated.
If one is provided, it must be the same length as the range." An example
should be inserted after each of these sentences.

* "// Do something expensive with line." Better you do something
expensive with line.

* "This is not the same thing as commutativity." I think this is obvious
enough to be left out. The example of matrices (associative but not
commutative) is nice though.

* "immutable n = 10;" -> use underscores

* Would be great if the documentation examples included some rough
experimental results, e.g. "3.8 times faster on a 4-core machine than
the equivalent serial loop".

* No example for workerIndex and why it's useful.

* Is LocalWorker better than WorkerLocal? No, because it's not the
worker that's local, it's the storage - which is missing from the name!
WorkerLocalStorage? Also replace "create" with "make" or drop it
entirely. The example doesn't tell me how I can use bufs. I suspect
work