Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-22 Thread Michael Schnell

On 08/20/2011 12:36 PM, Alexander Klenin wrote:

> Basically, no.
> Here is a quite recent "call for action" in this regard:
> http://software.intel.com/en-us/blogs/2011/08/09/parallelism-as-a-first-class-citizen-in-c-and-c-the-time-has-come/

I thought C++11 would already be more advanced in that regard ;).
-Michael
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-22 Thread Michael Schnell

On 08/19/2011 01:53 PM, David W Noon wrote:


> The 2011 C++ standard does not, but GCC and a few other compilers offer
> a facility called OpenMP that parallelises loops; it works for C, C++
> and FORTRAN, at least within GCC.
I do know about OpenMP, and I seem to remember that there is an article 
about it in the FPC Wiki.


A C++ extension might use some syntax candy (like a "parallel" keyword) to 
make OpenMP (hidden in a library) easily usable.


std::async(), std::future and std::promise seem to provide some 
"syntax candy" for making threads more usable.


Here in the FAQ they write:
"The *packaged_task* type is provided to simplify launching a thread to 
execute a task. In particular, it takes care of setting up a *future* 
connected to a *promise* and provides the wrapper code to put the 
return value or exception from the task into the *promise*."


-Michael


Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-20 Thread Alexander Klenin
On Fri, Aug 19, 2011 at 21:33, Michael Schnell  wrote:
> Does C++11 also provide a more "automatic" parallel processing feature than
> just something similar to worker threads (Object Pascal TThread), maybe
> similar to Prism's "parallel loop" ?

Basically, no.
Here is a quite recent "call for action" in this regard:
http://software.intel.com/en-us/blogs/2011/08/09/parallelism-as-a-first-class-citizen-in-c-and-c-the-time-has-come/

-- 
Alexander S. Klenin


Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-19 Thread Michael Schnell

On 08/19/2011 02:15 PM, David W Noon wrote:

> I might do some experiments in C# to see if the thread manager creates
> threads with process scope or system scope, as it might be a bit
> smarter than Java's "green threads".
I remember that there once was a version of the pthread library that used 
an internal userland scheduler to handle the threads. There, of course, 
only one thread is visible to the OS and thus only one CPU is used. So the 
"green thread" model is available in C (and FPC) as well. I suppose this 
library is not much used any more.


-Michael


Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-19 Thread Michael Schnell

On 08/19/2011 02:15 PM, David W Noon wrote:

> Threads are not tied to processors.
I do know that in Linux the scheduler tries to keep a thread on one core 
if possible (even when re-scheduling the thread after preemption). This 
greatly improves cache performance.

> This is especially true in
> "managed" languages, where threads usually have process scope and
> consequently all run on the same CPU. [I.e. a thread with process scope
> cannot have CPU affinity distinct from that of the main thread in the
> process, whereas a thread with system scope can run wherever the
> operating system's CPU dispatcher sends it, possibly constrained by
> that thread's own CPU affinity mask.]
While I do understand what you mean, I have never heard of CPU affinity 
being set such that a process is only allowed to use one core for all 
its threads. If that is set, parallel tasks of course make no sense at 
all. I suppose it can be set for special purposes, but I don't think it 
is a standard setting (e.g. in Linux).


As a growing class of programs is designed to take advantage of 
multi-core systems - and I suppose most do this by distributing 
cycle-hungry tasks across multiple threads - I understand that the 
normal behaviour of the OS is to allow a single application to use 
multiple CPUs.


-Michael


Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-19 Thread Michael Schnell

On 08/19/2011 02:15 PM, David W Noon wrote:

> My experience with OpenMP is that it is difficult to write a loop body
> large enough that context switching does not overwhelm the benefits of
> parallelism.


Hmmm.

If you do a multiplication of a 100*100 matrix you could spawn 10000 
threads, and this will result in a huge switching overhead.


But if you have 10 cores and you aggregate the 10000 tasks into 10 groups 
of 1000 calculations each, spawn 10 threads and have each go through a 
loop calculating 1000 cells, I gather that (in a perfect world) no 
task-switching overhead at all would be necessary (except at the beginning 
and the end of the complete calculation).


If in Prism you do something like (pseudo-code draft):

m := 100;
n := 100;

for parallel ij := 0 to m*n-1 do begin
  i := ij mod m;
  j := ij div m;
  calccell(i, j);
end;


I understand that Prism (or rather .NET) on a 10-core machine would 
automatically create 10 threads, each doing 1000 cells.


-Michael


Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-19 Thread Michael Schnell

On 08/19/2011 02:15 PM, David W Noon wrote:

> That is not my experience.  While CIL byte code interprets much faster
> than Java byte code, it is still discernibly slower than native object
> code.
Hmm. I don't have any personal experience with this, but from what I 
read, the idea behind CIL is to split the normal compile procedure into

(1) code analysis, high-level optimization and generation of an 
intermediate code that is independent of the original language and the 
target CPU, and

(2) low-level optimization and target code generation,

and to perform the second step on the target system at load time.

So it does not _need_ to run slower (it does load slower, though). 
But of course a really good "full" compiler might do a better optimizing 
job.


Code snippets that are worth executing in parallel (e.g. calculating a 
cell in a matrix multiplication) should not be highly prone to speed 
degradation under CIL.



> [Or perhaps my C# is not that good.]

I doubt that this is the case :) .

-Michael


Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-19 Thread Jonas Maebe


On 19 Aug 2011, at 14:15, David W Noon wrote:


> I might do some experiments in C# to see if the thread manager creates
> threads with process scope or system scope, as it might be a bit
> smarter than Java's "green threads".


According to Wikipedia, Java stopped using the green threads model 
after JDK 1.1. I'm pretty sure that all Java threads are native 
threads on today's implementations (at least on desktop and server 
platforms).



Jonas


Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-19 Thread David W Noon
On Fri, 19 Aug 2011 12:30:43 +0200, Michael Schnell wrote about Re:
[fpc-devel] C++ gets language-internal concurrency support:

>On 08/17/2011 06:49 PM, David W Noon wrote:
>> Perhaps the slower execution speed of CIL (.NET,
>> Mono) byte code masks the context switching overheads and makes this
>> practice look less inefficient.
>
>I doubt that this is the case. AFAIK, CIL code is not necessarily much 
>slower than native code. (Of course it can be much slower, especially 
>when garbage collection is necessary.)

That is not my experience.  While CIL byte code interprets much faster
than Java byte code, it is still discernibly slower than native object
code. [Or perhaps my C# is not that good.]

>But when using a parallel loop "decently" i.e. for doing not too short 
>unrelated calculations with only so many parallel threads as
>processors are available, this is supposed to grant a good speedup.

My experience with OpenMP is that it is difficult to write a loop body
large enough that context switching does not overwhelm the benefits of
parallelism.

>AFAIK with prism the parallel loop automatically is broken into as
>many parallel threads as available processors, thus avoiding
>additional context switches.

Threads are not tied to processors.  This is especially true in
"managed" languages, where threads usually have process scope and
consequently all run on the same CPU. [I.e. a thread with process scope
cannot have CPU affinity distinct from that of the main thread in the
process, whereas a thread with system scope can run wherever the
operating system's CPU dispatcher sends it, possibly constrained by
that thread's own CPU affinity mask.]

The reason the "managed" languages typically use process scoped threads
is that the threads are dispatched by the language's thread manager, not
the operating system.  This gives the thread manager total control over
the threads, but it does mean that each thread is simply "called" in its
turn and so runs on the same CPU as the thread manager; any request
that would block the thread forces a return to the thread manager,
which gives another thread a run, etc.

I might do some experiments in C# to see if the thread manager creates
threads with process scope or system scope, as it might be a bit
smarter than Java's "green threads".
-- 
Regards,

Dave  [RLU #314465]
===
david.w.n...@ntlworld.com (David W Noon)
===




Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-19 Thread Michael Schnell

On 08/17/2011 06:03 PM, David W Noon wrote:


> I am (reasonably so, at least).


> Where is the parallel aspect?

As I said, I just did a quick search in the FAQ.

Does C++11 also provide a more "automatic" parallel processing feature 
than just something similar to worker threads (Object Pascal TThread), 
maybe similar to Prism's "parallel loop" ?


-Michael


Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-19 Thread Michael Schnell

On 08/17/2011 06:49 PM, David W Noon wrote:

> Perhaps the slower execution speed of CIL (.NET,
> Mono) byte code masks the context switching overheads and makes this
> practice look less inefficient.


I doubt that this is the case. AFAIK, CIL code is not necessarily much 
slower than native code. (Of course it can be much slower, especially 
when garbage collection is necessary.)


But when using a parallel loop "decently", i.e. for doing not-too-short, 
unrelated calculations with only as many parallel threads as processors 
are available, this is supposed to yield a good speedup. AFAIK with 
Prism the parallel loop is automatically broken into as many parallel 
threads as there are available processors, thus avoiding additional 
context switches.


-Michael


Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-17 Thread David W Noon
On Wed, 17 Aug 2011 18:14:03 +0200 (CEST), Marco van de Voort wrote
about Re: [fpc-devel] C++ gets language-internal concurrency support:

> In our previous episode, David W Noon said:
> 
> > The threads t1 and t2 execute in parallel.  Moreover, they will
> > execute in parallel with any code that occurs between the
> > declaration that start the threads and the join() method calls that
> > synchronize them with the invoking thread.  On a SMP system they
> > will execute physically in parallel, not simply timesliced against
> > one another.  The underlying implementation model is that of POSIX
> > threads.
> 
> I know, but this is an explicit form of parallelism, and spawns one
> thread, not much different from Delphi TThread (especially as coupled
> with anonymous methods in later versions).
> 
> The .NET/Prism "parallel for" however spawns multiple
> threads, one for each "for" iteration, probably with some maximum. 

This is like OpenMP and its parallelisation of FORTRAN DO-loops, which
can also be used for C/C++ for-loops.  It is quite a separate concept
from that of the std::thread class (and its vendor-supplied
predecessors).

> Note that I'm not so sure that the "parallel for" is a good (read:
> practical) thing to have, it is just that I noted some discrepancy in
> M. Schnell's original post where he tied the new C++ features to the
> Prism functionality.

My FORTRAN experience of it is that it is poor practice.  The
granularity of the workload is too fine for the context switching
overheads of threads.  Perhaps the slower execution speed of CIL (.NET,
Mono) byte code masks the context switching overheads and makes this
practice look less inefficient.
-- 
Regards,

Dave  [RLU #314465]
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
david.w.n...@ntlworld.com (David W Noon)
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*




Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-17 Thread Marco van de Voort
In our previous episode, David W Noon said:

> The threads t1 and t2 execute in parallel.  Moreover, they will execute
> in parallel with any code that occurs between the declaration that
> start the threads and the join() method calls that synchronize them
> with the invoking thread.  On a SMP system they will execute physically
> in parallel, not simply timesliced against one another.  The underlying
> implementation model is that of POSIX threads.

I know, but this is an explicit form of parallelism, and spawns one thread,
not much different from Delphi TThread (especially as coupled with anonymous
methods in later versions).

The .NET/Prism "parallel for" however spawns multiple
threads, one for each "for" iteration, probably with some maximum. 

Note that I'm not so sure that the "parallel for" is a good (read:
practical) thing to have, it is just that I noted some discrepancy in M. 
Schnell's original post where he tied the new C++ features to the Prism
functionality.


Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-17 Thread David W Noon
On Wed, 17 Aug 2011 16:24:35 +0200 (CEST), Marco van de Voort wrote
about Re: [fpc-devel] C++ gets language-internal concurrency support:

> In our previous episode, Michael Schnell said:
[snip]
>> int main()
>> {
>>     std::thread t1{std::bind(f,some_vec)};  // f(some_vec) executes in separate thread
>>     std::thread t2{F(some_vec)};            // F(some_vec)() executes in separate thread
>>
>>     t1.join();
>>     t2.join();
>> }
> 
> I'm no C++ expert, but:

I am (reasonably so, at least).

> Where is the parallel aspect?

The threads t1 and t2 execute in parallel.  Moreover, they will execute
in parallel with any code that occurs between the declaration that
start the threads and the join() method calls that synchronize them
with the invoking thread.  On a SMP system they will execute physically
in parallel, not simply timesliced against one another.  The underlying
implementation model is that of POSIX threads.

This is tantamount to the TThread class (e.g. from Borland C++) becoming
part of the C++ Standard Template Library (STL).  It is nothing
particularly new or special, but at least the C++ standard now
acknowledges concurrent execution and its need for reentrancy.
-- 
Regards,

Dave  [RLU #314465]
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
david.w.n...@ntlworld.com (David W Noon)
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*




Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-17 Thread Michael Schnell

On 08/17/2011 04:24 PM, Marco van de Voort wrote:


> I'm no C++ expert, but:
>
> Where is the parallel aspect? It looks more like a shorthand to spawn a
> thread to evaluate an expression/closure/function call, and then wait on it
> using .join().
Same here. I just did a short search in the FAQ ( 
http://www2.research.att.com/~bs/C++0xFAQ.html ) to get a glimpse.

> I'm not sure we are actually witnessing something like the parallel for you
> talked about earlier?  Are you sure?
Maybe this was in the Lazarus list. But I suppose it would be more 
appropriate here (if at all).


-Michael


Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-17 Thread Marco van de Voort
In our previous episode, Michael Schnell said:
> Some c++11 code doing parallel execution:
> 
>   void f(vector<double>&);
> 
>   struct F {
>       vector<double>& v;
>       F(vector<double>& vv) :v{vv} { }
>       void operator()();
>   };
> 
>   int main()
>   {
>       std::thread t1{std::bind(f,some_vec)};  // f(some_vec) executes in separate thread
>       std::thread t2{F(some_vec)};            // F(some_vec)() executes in separate thread
> 
>       t1.join();
>       t2.join();
>   }

I'm no C++ expert, but:

Where is the parallel aspect? It looks more like a shorthand to spawn a
thread to evaluate an expression/closure/function call, and then wait on it
using .join().  

The "closure"-like aspect (be able to pass expressions to be evaluated
somewhere else) looks like a bigger feature then that threads are now
grouped under namespace std.

I'm not sure we are actually witnessing something like the parallel for you
talked about earlier?  Are you sure?


Re: [fpc-devel] C++ gets language-internal concurrency support

2011-08-17 Thread Michael Schnell

Some c++11 code doing parallel execution:

    void f(vector<double>&);

    struct F {
        vector<double>& v;
        F(vector<double>& vv) :v{vv} { }
        void operator()();
    };

    int main()
    {
        std::thread t1{std::bind(f,some_vec)};  // f(some_vec) executes in separate thread
        std::thread t2{F(some_vec)};            // F(some_vec)() executes in separate thread

        t1.join();
        t2.join();
    }

-Michael



[fpc-devel] C++ gets language-internal concurrency support

2011-08-17 Thread Michael Schnell

http://www.linuxfordevices.com/c/a/News/C11-standard-approved/

Prism already has "parallel loops" and "future variables" for that 
purpose (but of course usable only with the .NET/Mono framework, as the 
implementation is done therein).


I remember discussions about providing something along those lines in FPC 
for native code.


-Michael