Re: [fpc-devel] C++ gets language-internal concurrency support
On 08/20/2011 12:36 PM, Alexander Klenin wrote:
> Basically, no. Here is a quite recent "call for action" in this regard:
> http://software.intel.com/en-us/blogs/2011/08/09/parallelism-as-a-first-class-citizen-in-c-and-c-the-time-has-come/

I thought C++11 would already be more advanced in that regard ;) .

-Michael
___
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
Re: [fpc-devel] C++ gets language-internal concurrency support
On 08/19/2011 01:53 PM, David W Noon wrote:
> The 2011 C++ standard does not, but GCC and a few other compilers offer a
> facility called OpenMP that parallelises loops; it works for C, C++ and
> FORTRAN, at least within GCC.

I do know about OpenMP, and I seem to remember that there is an article about it in the FPC Wiki. A C++ extension might use some syntax-candy (like "parallel") to make OpenMP (hidden in a library) easily usable.

std::async(), std::future and std::promise seem to provide some "syntax candy" for making threads more usable. The FAQ says:

"The *packaged_task* type is provided to simplify launching a thread to execute a task. In particular, it takes care of setting up a *future* connected to a *promise* and provides the wrapper code to put the return value or exception from the task into the *promise*."

-Michael
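As an illustration of the packaged_task description quoted above, here is a minimal sketch that is not from the thread. The names checked_square, square_on_thread and square_async are hypothetical and exist only for this example; the standard library pieces (std::packaged_task, std::future, std::async, std::thread) are used as defined in C++11.

```cpp
#include <cassert>
#include <future>
#include <stdexcept>
#include <thread>
#include <utility>

// Hypothetical task, for illustration only: squares a non-negative input
// and throws otherwise, so both the value and the exception path are visible.
int checked_square(int x) {
    if (x < 0) throw std::invalid_argument("negative input");
    return x * x;
}

// Runs checked_square on a separate thread through std::packaged_task.
// The task owns the promise; get_future() returns the future connected to it.
int square_on_thread(int x) {
    std::packaged_task<int(int)> task(checked_square);
    std::future<int> result = task.get_future();
    assert(result.valid());
    std::thread worker(std::move(task), x);  // the task runs in the new thread
    worker.join();
    return result.get();  // yields the stored value or rethrows the stored exception
}

// The same computation with std::async, which also hides the thread object.
int square_async(int x) {
    std::future<int> result = std::async(std::launch::async, checked_square, x);
    return result.get();
}
```

Calling square_on_thread(7) or square_async(7) gives the squared value back to the caller; with a negative argument the exception thrown in the worker is rethrown by get() in the calling thread, which is the "wrapper code" the FAQ mentions.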
Re: [fpc-devel] C++ gets language-internal concurrency support
On Fri, Aug 19, 2011 at 21:33, Michael Schnell wrote:
> Does C++11 also provide a more "automatic" parallel processing feature than
> just something similar to worker threads (Object Pascal TThread), maybe
> similar to Prism's "parallel loop" ?

Basically, no. Here is a quite recent "call for action" in this regard:
http://software.intel.com/en-us/blogs/2011/08/09/parallelism-as-a-first-class-citizen-in-c-and-c-the-time-has-come/

--
Alexander S. Klenin
Re: [fpc-devel] C++ gets language-internal concurrency support
On 08/19/2011 02:15 PM, David W Noon wrote:
> I might do some experiments in C# to see if the thread manager creates
> threads with process scope or system scope, as it might be a bit smarter
> than Java's "green threads".

I remember that there once was a version of the PThreadLib that used an internal userland scheduler for handling the threads. In that case, of course, only one thread is visible to the OS and thus only one CPU is used. So the "green thread" model is available in C (and FPC) as well. I suppose this library is not much in use any more.

-Michael
Re: [fpc-devel] C++ gets language-internal concurrency support
On 08/19/2011 02:15 PM, David W Noon wrote:
> Threads are not tied to processors.

I do know that in Linux the scheduler tries to keep a thread on the same core if possible (even when re-scheduling the thread after preemption). This greatly increases cache performance.

> This is especially true in "managed" languages, where threads usually have
> process scope and consequently all run on the same CPU. [I.e. a thread with
> process scope cannot have CPU affinity distinct from that of the main thread
> in the process, whereas a thread with system scope can run wherever the
> operating system's CPU dispatcher sends it, possibly constrained by that
> thread's own CPU affinity mask.]

While I do understand what you mean, I have never heard of CPU affinity being defined in such a way that a process is only allowed to use one core for all its threads. If this is set, of course, parallel tasks don't make any sense at all. I suppose that this can be set for special purposes, but I don't think that it's a standard setting (e.g. in Linux).

As a growing "class" of programs is designed to take advantage of a multi-core system - and I suppose most do this by distributing cycle-hungry tasks across multiple threads - I understand that the normal behaviour of the OS is to allow multiple CPUs for a single application.

-Michael
Re: [fpc-devel] C++ gets language-internal concurrency support
On 08/19/2011 02:15 PM, David W Noon wrote:
> My experience with OpenMP is that it is difficult to write a loop body large
> enough that context switching does not overwhelm the benefits of parallelism.

Hmmm. If you do a multiplication of a 100*100 matrix, you could spawn 10000 threads, and this would result in a huge switching overhead. But if you have 10 cores and you aggregate the 10000 tasks into 10 groups of 1000 calculations each, spawn 10 threads and have each go through a loop calculating 1000 cells, I gather that (in a perfect world) no task switching overhead at all would be necessary (except at the beginning and the end of the complete calculation).

If in Prism you do something like (pseudo-code draft):

  m := 100; n := 100;
  for parallel ij := 0 to m*n-1 do begin
    i := ij mod m;
    j := ij div m;
    calccell(i, j);
  end;

I understand that Prism (or rather .NET) on a 10-core machine would automatically create 10 threads, each doing 1000 cells.

-Michael
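The grouping idea above (10000 cells split into as many contiguous chunks as there are workers) can be sketched in C++11 as follows. This is only an illustration under stated assumptions, not how Prism or .NET implement their parallel loop: calc_cell is a hypothetical stand-in for calccell, and fill_in_chunks is a made-up helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for calccell(i, j) from the pseudo-code.
double calc_cell(std::size_t i, std::size_t j) {
    return static_cast<double>(i * 1000 + j);
}

// Computes m*n cells using at most `workers` threads, each handling one
// contiguous range of the flattened index ij (0 .. m*n-1).
std::vector<double> fill_in_chunks(std::size_t m, std::size_t n, unsigned workers) {
    assert(workers > 0);
    const std::size_t total = m * n;
    std::vector<double> cells(total);
    const std::size_t chunk = (total + workers - 1) / workers;  // ceiling division
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(total, w * chunk);
        const std::size_t end = std::min(total, begin + chunk);
        if (begin == end) break;  // nothing left for further workers
        pool.emplace_back([&cells, begin, end, m] {
            for (std::size_t ij = begin; ij < end; ++ij) {
                const std::size_t i = ij % m;  // same index split as the pseudo-code
                const std::size_t j = ij / m;
                cells[ij] = calc_cell(i, j);   // each thread writes only its own range
            }
        });
    }
    for (std::thread& t : pool) t.join();
    return cells;
}
```

With m = n = 100 and workers = 10, each thread loops over 1000 cells, matching the numbers in the message. In real code the worker count would typically come from std::thread::hardware_concurrency(), which may return 0 when the value is not known.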
Re: [fpc-devel] C++ gets language-internal concurrency support
On 08/19/2011 02:15 PM, David W Noon wrote:
> That is not my experience. While CIL byte code interprets much faster than
> Java byte code, it is still discernibly slower than native object code.

Hmm. I don't have any personal experience with this, but from what I read, the idea behind CIL is to split the normal compile procedure into two steps: (1) code analysis, high-level optimization and creation of an intermediate code that is independent of the original language and the target CPU, and (2) low-level optimization and target code generation. The second step is performed on the target system when loading. So it does not _need_ to run slower (it _needs_ to load slower, though). But of course a really good "full" compiler might do a better optimizing job.

Code snippets that are worth running in parallel (e.g. calculating a cell in a matrix multiplication) should not be highly prone to speed degradation by CIL.

> [Or perhaps my C# is not that good.]

I doubt that this is the case :) .

-Michael
Re: [fpc-devel] C++ gets language-internal concurrency support
On 19 Aug 2011, at 14:15, David W Noon wrote:
> I might do some experiments in C# to see if the thread manager creates
> threads with process scope or system scope, as it might be a bit smarter
> than Java's "green threads".

According to Wikipedia, Java stopped using the green threads model after JDK 1.1. I'm pretty sure that all Java threads are native threads on today's implementations (at least on desktop and server platforms).

Jonas
Re: [fpc-devel] C++ gets language-internal concurrency support
On Fri, 19 Aug 2011 12:30:43 +0200, Michael Schnell wrote about Re: [fpc-devel] C++ gets language-internal concurrency support:

>On 08/17/2011 06:49 PM, David W Noon wrote:
>> Perhaps the slower execution speed of CIL (.NET,
>> Mono) byte code masks the context switching overheads and makes this
>> practice look less inefficient.
>
>I doubt that this is the case. AFAIK, CIL code is not necessarily much
>slower than native code. (Of course it can be much slower, especially
>when garbage collection is necessary.)

That is not my experience. While CIL byte code interprets much faster than Java byte code, it is still discernibly slower than native object code. [Or perhaps my C# is not that good.]

>But when using a parallel loop "decently" i.e. for doing not too short
>unrelated calculations with only so many parallel threads as
>processors are available, this is supposed to grant a good speedup.

My experience with OpenMP is that it is difficult to write a loop body large enough that context switching does not overwhelm the benefits of parallelism.

>AFAIK with prism the parallel loop automatically is broken into as
>many parallel threads as available processors, thus avoiding
>additional context switches.

Threads are not tied to processors. This is especially true in "managed" languages, where threads usually have process scope and consequently all run on the same CPU. [I.e. a thread with process scope cannot have CPU affinity distinct from that of the main thread in the process, whereas a thread with system scope can run wherever the operating system's CPU dispatcher sends it, possibly constrained by that thread's own CPU affinity mask.]

The reason the "managed" languages typically use process scoped threads is that the threads are dispatched by the language's thread manager, not the operating system. This gives the thread manager total control over the threads, but it does mean that each thread is simply "called" in its turn and so runs on the same CPU as the thread manager; any request that would block the thread forces a return to the thread manager, which gives another thread a run, etc.

I might do some experiments in C# to see if the thread manager creates threads with process scope or system scope, as it might be a bit smarter than Java's "green threads".

--
Regards,
Dave [RLU #314465]
david.w.n...@ntlworld.com (David W Noon)
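To make the "called in its turn" model described above concrete, here is a toy sketch that is not from the thread. It assumes a green thread can be modelled as a callable that runs one step and then returns to the manager; GreenThread, run_round_robin and make_logger are hypothetical names, and no real runtime's thread manager is claimed to work exactly like this.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A "green thread" here is a callable that performs one step and reports
// whether it still has work left. Returning stands in for "would block".
using GreenThread = std::function<bool()>;

// Hypothetical userland thread manager: calls each task in turn on the
// calling OS thread, so all of them necessarily share one CPU.
void run_round_robin(std::deque<GreenThread> ready) {
    while (!ready.empty()) {
        GreenThread current = std::move(ready.front());
        ready.pop_front();
        if (current()) {
            ready.push_back(std::move(current));  // not finished: back of the queue
        }
    }
}

// Builds a task that appends `name` to `log` once per step, for `steps` steps.
GreenThread make_logger(std::string name, int steps, std::vector<std::string>& log) {
    assert(steps > 0);
    return [name, steps, &log]() mutable {
        log.push_back(name);
        return --steps > 0;
    };
}
```

Running two such tasks interleaves their steps in the log, yet at no point do two of them execute at the same time, which is the limitation discussed in the message.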
Re: [fpc-devel] C++ gets language-internal concurrency support
On 08/17/2011 06:03 PM, David W Noon wrote:
> I am (reasonably so, at least).
>
>> Where is the parallel aspect?

As I said, I just did a quick search in the FAQ.

Does C++11 also provide a more "automatic" parallel processing feature than just something similar to worker threads (Object Pascal TThread), maybe similar to Prism's "parallel loop" ?

-Michael
Re: [fpc-devel] C++ gets language-internal concurrency support
On 08/17/2011 06:49 PM, David W Noon wrote:
> Perhaps the slower execution speed of CIL (.NET, Mono) byte code masks the
> context switching overheads and makes this practice look less inefficient.

I doubt that this is the case. AFAIK, CIL code is not necessarily much slower than native code. (Of course it can be much slower, especially when garbage collection is necessary.)

But when a parallel loop is used "decently", i.e. for unrelated calculations that are not too short and with only as many parallel threads as there are processors available, it is supposed to give a good speedup.

AFAIK, with Prism the parallel loop is automatically broken into as many parallel threads as there are available processors, thus avoiding additional context switches.

-Michael
Re: [fpc-devel] C++ gets language-internal concurrency support
On Wed, 17 Aug 2011 18:14:03 +0200 (CEST), Marco van de Voort wrote about Re: [fpc-devel] C++ gets language-internal concurrency support:

> In our previous episode, David W Noon said:
>
> > The threads t1 and t2 execute in parallel. Moreover, they will
> > execute in parallel with any code that occurs between the
> > declaration that start the threads and the join() method calls that
> > synchronize them with the invoking thread. On a SMP system they
> > will execute physically in parallel, not simply timesliced against
> > one another. The underlying implementation model is that of POSIX
> > threads.
>
> I know, but this is an explicit form of parallelism, and spawns one
> thread, not much different from Delphi tthread. (specially as coupled
> with anonymous methods in later versions)
>
> The .NET/Prism "parallel for" however spawns multiple
> threads, one for each "for" iteration, probably with some maximum.

This is like OpenMP and its parallelisation of FORTRAN DO-loops, which can also be used for C/C++ for-loops. It is quite a separate concept from that of the std::thread class (and its vendor-supplied predecessors).

> Note that I'm not so sure that the "parallel for" is a good (read:
> practical) thing to have, it is just that I noted some discrepancy in
> M. Schnell's original post where he tied the new C++ features to the
> Prism functionality.

My FORTRAN experience of it is that it is poor practice. The granularity of the workload is too fine for the context switching overheads of threads. Perhaps the slower execution speed of CIL (.NET, Mono) byte code masks the context switching overheads and makes this practice look less inefficient.

--
Regards,
Dave [RLU #314465]
david.w.n...@ntlworld.com (David W Noon)
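For readers who have not seen the OpenMP loop parallelisation mentioned above, here is a minimal C++ sketch that is not from the thread. The function name sum_scaled and its parallel_threshold parameter are hypothetical; the pragma uses the standard OpenMP parallel for construct with its reduction and if clauses. The if clause is one way to address the granularity concern, since small loops then stay serial.

```cpp
#include <cassert>
#include <vector>

// Sums values[i] * factor over all elements.
// When compiled with OpenMP enabled (e.g. -fopenmp with GCC), the iterations
// are divided among a team of threads, but only if the loop has at least
// parallel_threshold iterations. Without OpenMP the pragma is ignored and
// the loop simply runs serially with the same result.
double sum_scaled(const std::vector<double>& values, double factor,
                  long parallel_threshold) {
    assert(parallel_threshold >= 0);
    const long n = static_cast<long>(values.size());
    double total = 0.0;
    #pragma omp parallel for reduction(+:total) if(n >= parallel_threshold)
    for (long i = 0; i < n; ++i) {
        total += values[i] * factor;
    }
    return total;
}
```

Where to set the threshold depends on the cost of the loop body and of starting the thread team, so it would have to be measured rather than guessed.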
Re: [fpc-devel] C++ gets language-internal concurrency support
In our previous episode, David W Noon said:
> The threads t1 and t2 execute in parallel. Moreover, they will execute
> in parallel with any code that occurs between the declaration that
> start the threads and the join() method calls that synchronize them
> with the invoking thread. On a SMP system they will execute physically
> in parallel, not simply timesliced against one another. The underlying
> implementation model is that of POSIX threads.

I know, but this is an explicit form of parallelism, and spawns one thread, not much different from Delphi TThread (especially when coupled with anonymous methods in later versions).

The .NET/Prism "parallel for", however, spawns multiple threads, one for each "for" iteration, probably with some maximum.

Note that I'm not so sure that the "parallel for" is a good (read: practical) thing to have; it is just that I noted some discrepancy in M. Schnell's original post, where he tied the new C++ features to the Prism functionality.
Re: [fpc-devel] C++ gets language-internal concurrency support
On Wed, 17 Aug 2011 16:24:35 +0200 (CEST), Marco van de Voort wrote about Re: [fpc-devel] C++ gets language-internal concurrency support:

> In our previous episode, Michael Schnell said:
[snip]
>> int main()
>> {
>>     std::thread t1{std::bind(f,some_vec)};  // f(some_vec) executes in separate thread
>>     std::thread t2{F(some_vec)};            // F(some_vec)() executes in separate thread
>>
>>     t1.join();
>>     t2.join();
>> }
>
> I'm no C++ expert, but:

I am (reasonably so, at least).

> Where is the parallel aspect?

The threads t1 and t2 execute in parallel. Moreover, they will execute in parallel with any code that occurs between the declaration that start the threads and the join() method calls that synchronize them with the invoking thread. On a SMP system they will execute physically in parallel, not simply timesliced against one another. The underlying implementation model is that of POSIX threads.

This is tantamount to the TThread class (e.g. from Borland C++) becoming part of the C++ Standard Template Library (STL). It is nothing particularly new or special, but at least the C++ standard now acknowledges concurrent execution and its need for reentrancy.

--
Regards,
Dave [RLU #314465]
david.w.n...@ntlworld.com (David W Noon)
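The point that the invoking thread keeps running between the thread declarations and the join() calls can be shown with a small sketch that is not from the thread. The helpers partial_sum and sum_in_three_parts are hypothetical names chosen for this example; only std::thread, std::ref and std::cref come from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical worker: adds up data[begin..end) and stores the result in out.
void partial_sum(const std::vector<int>& data, std::size_t begin,
                 std::size_t end, long& out) {
    assert(begin <= end && end <= data.size());
    out = std::accumulate(data.begin() + begin, data.begin() + end, 0L);
}

long sum_in_three_parts(const std::vector<int>& data) {
    const std::size_t a = data.size() / 3;
    const std::size_t b = 2 * data.size() / 3;
    long first = 0, second = 0, third = 0;
    // Both threads may start running as soon as they are constructed.
    std::thread t1(partial_sum, std::cref(data), std::size_t{0}, a, std::ref(first));
    std::thread t2(partial_sum, std::cref(data), a, b, std::ref(second));
    // This call runs on the invoking thread while t1 and t2 are working.
    partial_sum(data, b, data.size(), third);
    t1.join();  // after both joins, first and second are safe to read
    t2.join();
    return first + second + third;
}
```

Each of the three ranges writes to its own result variable, so no locking is needed; the join() calls are the only synchronization points.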
Re: [fpc-devel] C++ gets language-internal concurrency support
On 08/17/2011 04:24 PM, Marco van de Voort wrote:
> I'm no C++ expert, but: Where is the parallel aspect? It looks more like a
> shorthand to spawn a thread to evaluate an expression/closure/function call,
> and then wait on it using .join().

Same here. I just did a short search in the FAQ ( http://www2.research.att.com/~bs/C++0xFAQ.html ) to get a glimpse.

> I'm not sure we are actually witnessing something like the parallel for you
> talked about earlier? Are you sure?

Maybe this was in the Lazarus list. But I suppose here it would be more appropriate (if at all).

-Michael
Re: [fpc-devel] C++ gets language-internal concurrency support
In our previous episode, Michael Schnell said:
> Some c++11 code doing parallel execution:
>
> void f(vector<double>&);
>
> struct F {
>     vector<double>& v;
>     F(vector<double>& vv) :v{vv} { }
>     void operator()();
> };
>
> int main()
> {
>     std::thread t1{std::bind(f,some_vec)};  // f(some_vec) executes in separate thread
>     std::thread t2{F(some_vec)};            // F(some_vec)() executes in separate thread
>
>     t1.join();
>     t2.join();
> }

I'm no C++ expert, but:

Where is the parallel aspect? It looks more like a shorthand to spawn a thread to evaluate an expression/closure/function call, and then wait on it using .join().

The "closure"-like aspect (being able to pass expressions to be evaluated somewhere else) looks like a bigger feature than the fact that threads are now grouped under namespace std.

I'm not sure we are actually witnessing something like the parallel for you talked about earlier? Are you sure?
Re: [fpc-devel] C++ gets language-internal concurrency support
Some C++11 code doing parallel execution:

    #include <functional>   // std::bind
    #include <thread>
    #include <vector>
    using std::vector;

    vector<double> some_vec;   // f and F::operator() are defined elsewhere

    void f(vector<double>&);

    struct F {
        vector<double>& v;
        F(vector<double>& vv) :v{vv} { }
        void operator()();
    };

    int main()
    {
        // Note: std::bind stores a copy of some_vec; std::ref(some_vec) would share it.
        std::thread t1{std::bind(f,some_vec)};  // f(some_vec) executes in separate thread
        std::thread t2{F(some_vec)};            // F(some_vec)() executes in separate thread

        t1.join();
        t2.join();
    }

-Michael
[fpc-devel] C++ gets language-internal concurrency support
http://www.linuxfordevices.com/c/a/News/C11-standard-approved/

Prism already has "parallel loops" and "future variables" for that purpose (but of course they are usable only with a .NET/Mono framework, as the implementation is done there).

I remember discussions about providing something along those lines in FPC for native code.

-Michael