[Caml-list] Shared memory parallel application: kernel threads

2010-03-12 Thread Hugo Ferreira

Hello,

I need to implement (meta) heuristic algorithms that
use parallelism in order to (attempt to) solve a (hard)
machine learning problem that is inherently exponential.
The aim is to take maximum advantage of the multi-core
processors I have access to.

To that effect I have revisited many of the lively
discussions in threads related to concurrency, parallelism
and shared memory in this mailing list. I however still
have many doubts, some of which are very basic.

My initial objective is to write a very simple test that
launches k jobs. Each of these jobs must access
a common data set that is read-only. Each of the k threads
in turn generates its own data. The data generated by the k
jobs are then placed in a queue for further processing.

The process continues by launching (or reusing) k/2 jobs.
Each job consumes two elements from the queue that were
previously generated (the common data set must still be
available). The process repeats itself until k=1. Note
that the queued data is not small, nor can I determine
a fixed maximum size for it.
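
The scheme described above can be sketched sequentially as follows. This
only illustrates the data flow and is not a parallel implementation:
`generate` and `combine` are hypothetical placeholders for the real jobs,
and the standard library's Queue stands in for whatever shared queue is
eventually used.

```ocaml
(* Sequential sketch: k jobs fill a queue, then pairs of results are
   combined until a single element remains.
   [generate] and [combine] are hypothetical placeholders. *)

(* First round: one job per index, all reading the same data set. *)
let first_round ~generate shared k =
  let q = Queue.create () in
  for i = 0 to k - 1 do
    Queue.add (generate shared i) q
  done;
  q

(* Each round takes elements two at a time and queues their combination.
   An odd element left over in a round is carried to the next round. *)
let rec reduce ~combine shared q =
  if Queue.length q <= 1 then Queue.take_opt q
  else begin
    let next = Queue.create () in
    while Queue.length q >= 2 do
      let a = Queue.take q in
      let b = Queue.take q in
      Queue.add (combine shared a b) next
    done;
    Queue.transfer q next;  (* moves the leftover element, if any *)
    reduce ~combine shared next
  end

let () =
  let shared = [| 1; 2; 3; 4 |] in  (* stand-in for the read-only data set *)
  let generate data i = [ data.(i mod Array.length data) ] in
  let combine _data a b = a @ b in
  match reduce ~combine shared (first_round ~generate shared 8) with
  | Some result -> Printf.printf "%d elements\n" (List.length result)
  | None -> print_endline "no jobs"
```

The queue is unbounded here, which matches the remark that no fixed
maximum size can be given for the queued data.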

I have opted to use "kernel-level threads" that allow use
of the (multi-core) processors but still allow "easy"
access to shared memory.

I have taken a cursory look at:
- Ocaml.Threads
- Ocaml.Unix (LinuxThreads)
- coThreads
- Ocamlnet2/3 (netshm, netcamlbox)
(An eThreads library exists in the forge but I did not examine this)

My first concern is to take advantage of the multi-cores so:

1. The thread library is not the answer
   Chapter 24 - "The threads library is implemented by time-sharing on a
   single processor. It will not take advantage of multi-processor
   machines." [1]

2. LinuxThreads seems to be what I need
   "The main strength of this approach is that it can take full
   advantage of multiprocessors." [2]


Issue 1

In the manual [3] I see only references to functions for the creation
and use of processes. I see no calls that allow me to simply create
a thread and assign a function (job) to it (such as val create :
('a -> 'b) -> 'a -> t in the Thread module). The unix library, where
LinuxThreads is now integrated, shows the same API. Am I missing something,
or is there no way to launch "threaded functions" from the Unix module?
Naturally I assume that threads and processes are not the same thing.

Issue 2

If I cannot launch kernel-threads to allow for easy memory sharing, what
other options do I have besides netshm? The data I must share is defined
by a recursive variant and is not simple numerical data.

I would appreciate any comments.

TIA,
Hugo F.


[1] http://caml.inria.fr/pub/docs/manual-ocaml/manual038.html
[2] http://pauillac.inria.fr/~xleroy/linuxthreads/
[3] http://caml.inria.fr/pub/docs/manual-ocaml/libref/ThreadUnix.html
[4] http://caml.inria.fr/pub/docs/manual-ocaml/manual035.html



___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


Re: [Caml-list] Shared memory parallel application: kernel threads

2010-03-12 Thread Gerd Stolpmann
On Fr, 2010-03-12 at 11:55 +, Hugo Ferreira wrote:
> Hello,
> 
> I need to implement (meta) heuristic algorithms that
> use parallelism in order to (attempt to) solve a (hard)
> machine learning problem that is inherently exponential.
> The aim is to take maximum advantage of the multi-core
> processors I have access to.
> 
> To that effect I have revisited many of the lively
> discussions in threads related to concurrency, parallelism
> and shared memory in this mailing list. I however still
> have many doubts, some of which are very basic.
> 
> My initial objective is to write a very simple test that
> launches k jobs. Each of these jobs must access
> a common data set that is read-only. Each of the k threads
> in turn generates its own data. The data generated by the k
> jobs are then placed in a queue for further processing.
>
> The process continues by launching (or reusing) k/2 jobs.
> Each job consumes two elements from the queue that were
> previously generated (the common data set must still be
> available). The process repeats itself until k=1. Note
> that the queued data is not small, nor can I determine
> a fixed maximum size for it.
>
> I have opted to use "kernel-level threads" that allow use
> of the (multi-core) processors but still allow "easy"
> access to shared memory.
>
> I have taken a cursory look at:
> - Ocaml.Threads
> - Ocaml.Unix (LinuxThreads)
> - coThreads
> - Ocamlnet2/3 (netshm, netcamlbox)
> (An eThreads library exists in the forge but I did not examine this)
> 
> My first concern is to take advantage of the multi-cores so:
> 
> 1. The thread library is not the answer
> Chapter 24 - "The threads library is implemented by time-sharing on a
> single processor. It will not take advantage of multi-processor
> machines." [1]
>
> 2. LinuxThreads seems to be what I need
> "The main strength of this approach is that it can take full
> advantage of multiprocessors." [2]

I think you are mixing up several things here. LinuxThreads has nothing to
do with OCaml. It is an implementation of kernel threads for Linux at the C
level. It is considered outdated today, and has usually been replaced
by a better implementation (NPTL) that conforms more strictly to the
POSIX standard.

OCaml's multi-threading implementation uses the multi-threading
API the OS provides. This might be LinuxThreads or NPTL or something
else. So, in the lower half of the implementation the threads are kernel
threads, and multi-core-enabled. However, OCaml prevents more than
one of the kernel threads from running inside its runtime at any time. So
OCaml code will always run on only one core (but you can call C code,
and this can then take full advantage of multiple cores).

This is the primary reason I am going with multi-processing in my
projects, and why Ocamlnet focuses on it.

The Netcamlbox module of Ocamlnet 3 might be interesting for you. Here
is an example program that mass-multiplies matrices on several cores:

https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/examples/camlbox/manymult.ml

Netcamlbox can move complex values to shared memory, so you are not
restricted to bigarrays. The matrix example uses float array array as
representation. Recursive variants should also be fine.

To provide shared data to all workers, you can simply load it into
the master process before the child processes are forked off. Another
option (especially when it is a lot of data, and you cannot afford to
have n copies) is to create another camlbox in the master process before
forking, and to copy the shared data into it before forking. This avoids
copying the data at fork time.
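
A minimal sketch of the first option, assuming a Unix-like system and the
standard unix library rather than Netcamlbox: the master builds the shared
value, forks the workers, and each worker sends its result back over a pipe
with Marshal. The `tree` type and the `sum`, `spawn` and `collect` helpers
are hypothetical names chosen for illustration, and Unix._exit assumes
OCaml 4.12 or later.

```ocaml
(* Build with the unix library, e.g.
   ocamlfind ocamlopt -package unix -linkpkg workers.ml *)

(* Hypothetical recursive variant standing in for the real data. *)
type tree = Leaf of int | Node of tree * tree

let rec sum = function
  | Leaf n -> n
  | Node (l, r) -> sum l + sum r

(* Fork one worker that applies [f] to [shared] and writes the marshalled
   result to a pipe. Returns the child's pid and the channel to read from. *)
let spawn (f : 'a -> 'b) (shared : 'a) : int * in_channel =
  let rd, wr = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
      (* Child: it sees [shared] because fork duplicated the address space. *)
      let code =
        try
          Unix.close rd;
          let oc = Unix.out_channel_of_descr wr in
          Marshal.to_channel oc (f shared) [];
          close_out oc;
          0
        with _ -> 1
      in
      (* Leave without running the parent's at_exit handlers. *)
      Unix._exit code
  | pid ->
      Unix.close wr;
      (pid, Unix.in_channel_of_descr rd)

(* Read one worker's result, then reap the child process.
   Raises End_of_file if the worker failed before writing anything. *)
let collect ((pid, ic) : int * in_channel) : int =
  let (result : int) = Marshal.from_channel ic in
  close_in ic;
  ignore (Unix.waitpid [] pid);
  result

let () =
  let shared = Node (Node (Leaf 1, Leaf 2), Leaf 3) in  (* loaded before fork *)
  let workers = List.init 4 (fun i -> spawn (fun t -> i * sum t) shared) in
  List.iter (fun w -> Printf.printf "%d " (collect w)) workers;
  print_newline ()
```

Marshal.from_channel is not type-safe, so the annotation on the result must
match what the worker actually wrote.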

One drawback of Netcamlbox is that it is unsafe, and violating the
programming rules is punished with crashes. (But this also applies, to
some extent, to multi-threading, only that the rules are different.)

Gerd


Re: [Caml-list] [ANN] CCSS 1.0

2010-03-12 Thread Daniel Bünzli
> which I posit should rarely be an issue, precisely because
> they live in different contexts.

Example, a box whose height is a picture followed by three lines of text.

> But anyway, I've implemented unit conversion, which you can find in SVN [1].
> In the next release this feature may be activated at user discretion.

Great, thanks.

Daniel



Re: [Caml-list] Shared memory parallel application: kernel threads

2010-03-12 Thread Hugo Ferreira

Hi,

Gerd Stolpmann wrote:
> On Fr, 2010-03-12 at 11:55 +, Hugo Ferreira wrote:
> > Hello,
> >
> > I need to implement (meta) heuristic algorithms that
> > use parallelism in order to (attempt to) solve a (hard)
> > machine learning problem that is inherently exponential.
> > The aim is to take maximum advantage of the multi-core
> > processors I have access to.

snip

> > My first concern is to take advantage of the multi-cores so:
> >
> > 1. The thread library is not the answer
> > Chapter 24 - "The threads library is implemented by time-sharing on a
> > single processor. It will not take advantage of multi-processor
> > machines." [1]
> >
> > 2. LinuxThreads seems to be what I need
> > "The main strength of this approach is that it can take full
> > advantage of multiprocessors." [2]
>
> I think you are mixing up several things here. LinuxThreads has nothing to
> do with OCaml. It is an implementation of kernel threads for Linux at the C
> level. It is considered outdated today, and has usually been replaced
> by a better implementation (NPTL) that conforms more strictly to the
> POSIX standard.

Oops. Silly me.

> OCaml's multi-threading implementation uses the multi-threading
> API the OS provides. This might be LinuxThreads or NPTL or something
> else. So, in the lower half of the implementation the threads are kernel
> threads, and multi-core-enabled.

OK. I should have read more carefully. As stated in the manual "Two
implementations of the threads library are available, depending on the
capabilities of the operating system:" So I have a recent glibc and
therefore "multi-core-enabled" threads.

> However, OCaml prevents more than
> one of the kernel threads from running inside its runtime at any time. So
> OCaml code will always run on only one core (but you can call C code,
> and this can then take full advantage of multiple cores).

OK. I was under the (wrong) impression that the native OS threads did
run simultaneously (multi-core) but were intermittently stopped due to
the GC. So threads won't help.

> This is the primary reason I am going with multi-processing in my
> projects, and why Ocamlnet focuses on it.

Understood.

> The Netcamlbox module of Ocamlnet 3 might be interesting for you. Here
> is an example program that mass-multiplies matrices on several cores:
>
> https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/examples/camlbox/manymult.ml
>
> Netcamlbox can move complex values to shared memory, so you are not
> restricted to bigarrays. The matrix example uses float array array as
> representation. Recursive variants should also be fine.
>
> To provide shared data to all workers, you can simply load it into
> the master process before the child processes are forked off. Another
> option (especially when it is a lot of data, and you cannot afford to
> have n copies) is to create another camlbox in the master process before
> forking, and to copy the shared data into it before forking. This avoids
> copying the data at fork time.

The main data set is large, so I will opt for the latter.

> One drawback of Netcamlbox is that it is unsafe, and violating the
> programming rules is punished with crashes. (But this also applies, to
> some extent, to multi-threading, only that the rules are different.)

Not an issue for me.
Going to read up on and install ocamlnet3.

Thanks,
Hugo F.

> Gerd




[Caml-list] Re: Shared memory parallel application: kernel threads

2010-03-12 Thread Sylvain Le Gall
On 12-03-2010, Hugo Ferreira  wrote:
> Hello,
>
> I have opted to use "kernel-level threads" that allow use
> of the (multi-core) processors but still allow "easy"
> access to shared memory.
>
> I have taken a cursory look at:
> - Ocaml.Threads
> - Ocaml.Unix (LinuxThreads)
> - coThreads
> - Ocamlnet2/3 (netshm, netcamlbox)
> (An eThreads library exists in the forge but I did not examine this)
>

I think you should also have a look at ocaml/mpi for communication:
http://forge.ocamlcore.org/projects/ocamlmpi/
and ancient for accessing read-only memory:
http://merjis.com/developers/ancient

MPI can work on a single computer to take advantage of multiple cores
through multiple processes.

Regards,
Sylvain Le Gall



Re: [Caml-list] Re: Shared memory parallel application: kernel threads

2010-03-12 Thread Hugo Ferreira

Sylvain Le Gall wrote:
> On 12-03-2010, Hugo Ferreira wrote:
> > Hello,
> >
> > I have opted to use "kernel-level threads" that allow use
> > of the (multi-core) processors but still allow "easy"
> > access to shared memory.
> >
> > I have taken a cursory look at:
> > - Ocaml.Threads
> > - Ocaml.Unix (LinuxThreads)
> > - coThreads
> > - Ocamlnet2/3 (netshm, netcamlbox)
> > (An eThreads library exists in the forge but I did not examine this)
>
> I think you should also have a look at ocaml/mpi for communication:
> http://forge.ocamlcore.org/projects/ocamlmpi/
> and ancient for accessing read-only memory:
> http://merjis.com/developers/ancient
>
> MPI can work on a single computer to take advantage of multiple cores
> through multiple processes.

Indeed. I did not list these because I was specifically looking for
a shared-memory solution amongst threads. Seeing as I am forced to use
processes, ancient is worth considering.

Thanks,
Hugo F.

> Regards,
> Sylvain Le Gall



[Caml-list] Strangeness with atexit and exception backtraces

2010-03-12 Thread Michael Ekstrand
I have been using the Bolt logging library[1] lately in my code, but
have encountered a difficulty with debugging programs.  Bolt uses an
atexit handler to close all open log files when the program shuts down.
However, if an uncaught exception is encountered with this atexit
handler in place, the program terminates with status 2 but neither the
exception value nor its backtrace (with OCAMLRUNPARAM=b) is printed.

If I disable the atexit handler, the error information is printed as I
expect.

I am currently working around this by modifying Bolt so that the atexit
handler is disabled if OCaml starts up with stack traces enabled.  I am
wondering, though, if this is a known bug (or limitation)?  Is there
another workaround, or a fix on the horizon?  Or should I go file the
appropriate report in the OCaml bugtracker?
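
One possible workaround, sketched under the assumption that the program has
a single entry point that can be wrapped: catch the exception at the top
level and print it explicitly before exiting, so that the report does not
depend on what happens once the at_exit handlers run. This does not explain
the behaviour seen with Bolt. `describe` and `run_reporting` are
hypothetical names, and the at_exit handler below is only a stand-in for
Bolt's.

```ocaml
(* Stand-in for a library's cleanup handler (hypothetical). *)
let () = at_exit (fun () -> prerr_endline "closing log files")

(* Format an exception and its backtrace in the style of the runtime's
   own report for uncaught exceptions. *)
let describe e backtrace =
  Printf.sprintf "Fatal error: exception %s\n%s" (Printexc.to_string e) backtrace

(* Run [main]; on an uncaught exception, print the report ourselves and
   then exit with status 2. [exit] still runs the at_exit handlers. *)
let run_reporting main =
  Printexc.record_backtrace true;
  try main () with
  | e ->
      (* Fetch the backtrace first, before any other call can disturb it. *)
      let backtrace = Printexc.get_backtrace () in
      prerr_string (describe e backtrace);
      exit 2

let () = run_reporting (fun () -> print_endline "working")
```

Printexc.record_backtrace has the same effect as OCAMLRUNPARAM=b, but the
backtrace is only populated if the program was compiled with debugging
information (-g).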

Thanks,
- Michael



Re: [Caml-list] [ANN] CCSS 1.0

2010-03-12 Thread Dario Teixeira
Hi,

In the meantime I've released version 1.1, which includes the requested
unit conversion feature.  It works as I described in a previous email,
but is not activated by default.  If you really need this feature, simply
provide option '--convert' (short version '-c') upon command line invocation.

For more information: http://ccss.forge.ocamlcore.org/

Hope you find it useful!
Best regards,
Dario Teixeira







[Caml-list] Last CfP: PPDP'10

2010-03-12 Thread Temur Kutsia


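======================================================================
                          Call for Papers
                             PPDP 2010
            12th International ACM SIGPLAN Symposium on
        Principles and Practice of Declarative Programming
                Hagenberg, Austria, 26-28 July 2010
                   (co-located with LOPSTR 2010)
        http://www.risc.uni-linz.ac.at/conferences/ppdp2010/
======================================================================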

PPDP 2010 aims to bring together researchers from the declarative
programming communities, including those working in the logic,
constraint and functional programming paradigms, but also embracing a
variety of other paradigms such as visual programming, executable
specification languages, database languages, AI languages and
knowledge representation languages used, for example, in the semantic
web. The goal is to stimulate research in the use of logical
formalisms and methods for specifying, performing, and analysing
computations, including mechanisms for mobility, modularity,
concurrency, object-orientation, security, and static analysis. Papers
related to the use of declarative paradigms and tools in industry and
education are especially solicited.

The conference will take place in July 2010 in the Castle of
Hagenberg, Austria, colocated with the 20th International Symposium on
Logic-Based Program Synthesis and Transformation (LOPSTR 2010),
organised by the Research Institute for Symbolic Computation (RISC) of
the Johannes Kepler University Linz.

Topics:
* Logic, Constraint, and Functional Programming
* Database, AI and Knowledge Representation Languages
* Visual Programming
* Executable Specification Languages
* Applications of Declarative Programming
* Methodologies: Program Design and Development
* Declarative Aspects of Object-Oriented Programming
* Concurrent Extensions to Declarative Languages
* Declarative Mobile Computing
* Integration of Paradigms
* Proof Theoretic and Semantic Foundations
* Type and Module Systems
* Program Analysis and Verification
* Program Transformation
* Abstract Machines and Compilation
* Programming Environments
The list above is not exhaustive - submissions describing new and
interesting ideas relating broadly to declarative programming are
encouraged.

Submission guidelines:
Papers should be submitted via the Easychair submission website
for PPDP 2010:
http://www.easychair.org/conferences/?conf=ppdp2010
Papers should consist of the equivalent of 12 pages under the
ACM formatting guidelines. These guidelines are available online,
along with formatting templates or style files.
Submitted papers will be judged on the basis of significance,
relevance, correctness, originality, and clarity. They should include
a clear identification of what has been accomplished and why it is
significant. They must describe original, previously unpublished work
that has not been simultaneously submitted for publication
elsewhere. Authors who wish to provide additional material to the
reviewers beyond the 12-page limit can do so in clearly marked
appendices: reviewers are not required to read such appendices.
No simultaneous submission to other publication outlets (either a
conference or a journal) is allowed.


Proceedings:
The proceedings will be published by ACM Press. Authors of accepted
papers will be required to sign a copyright form.  Camera ready papers
for accepted papers should be prepared and submitted according to the
final instructions that will be sent by the publisher after
notification of acceptance.


Invited Speakers:
Maria Paola Bonacina (Università degli Studi di Verona, Italy)
Sumit Gulwani (Microsoft Research)


Important Dates:
# Submission (title and abstract): 15 March 2010
# Submission (full paper): 21 March 2010
# Notification: 23 April 2010
# Final version: 12 May 2010
# Symposium: 26-28 July 2010

Programme Committee:
Elvira Albert (Spain)
Sergio Antoy (US)
Frederic Blanqui (China)
Michele Bugliesi (Italy)
Giuseppe Castagna (France)
Mariangiola Dezani (Italy)
Francois Fages (France)
Maribel Fernandez (UK), chair
Joxan Jaffar (Singapore)
Andy King (UK)
Temur Kutsia (Austria)
Francisco Lopez Fraguas (Spain)
Ian Mackie (France)
Henrik Nilsson (UK)
Albert Rubio (Spain)
Kazunori Ueda (Japan)
Philip Wadler (UK)

Symposium Chairs:
Temur Kutsia and Wolfgang Schreiner (Austria)

For more information, please contact the chairs:
Maribel Fernandez
King's College London, UK
Email: maribel.fernan...@kcl.ac.uk

Temur Kutsia and Wolfgang Schreiner
Research Institute for Symbolic Computation
Johannes Kepler University Linz
Email: kut...@risc.uni-linz.ac.at



[Caml-list] camelia on windows config problem

2010-03-12 Thread Michael Hicks
I've just been playing with using Camelia on Windows, since some students in
my class are using it.

My problem is that I can't figure out how to configure Camelia to run a
non-Cygwin OCaml.  My installation puts the executables in
C:\Program Files\Objective Caml\bin, and I can redirect the configuration
to look here.  But when Camelia tries to run the ocaml toplevel, it
complains that it cannot find pervasives.cmi.  This is true even when I
also configure the libraries to be in C:\Program Files\Objective Caml\lib\.
When I run ocaml from the terminal, it runs fine (i.e., it's able to find
the .cmi file).

Any Camelia users with ideas about what is going on?  Thanks in advance!

-Mike



Re: [Caml-list] Re: Shared memory parallel application: kernel threads

2010-03-12 Thread Philippe Wang
Hi,

If your program doesn't need stability proven in use, you may be
interested in the "OCaml for Multicore" project, which provides an
alternative runtime library (prototype quality) that allows threads
to compute in parallel.
http://www.algo-prog.info/ocmc/

If you choose to give it a try, we would welcome your feedback.

Cheers,

-- 
Philippe Wang
   m...@philippewang.info
