Re: [Caml-list] Average cost of the OCaml GC

2010-11-16 Thread Goswin von Brederlow
Jianzhou Zhao jianz...@seas.upenn.edu writes:

 On Thu, Nov 11, 2010 at 3:11 PM, Goswin von Brederlow goswin-...@web.de 
 wrote:
 Jianzhou Zhao jianz...@seas.upenn.edu writes:

 On Thu, Nov 11, 2010 at 4:08 AM, Goswin von Brederlow goswin-...@web.de 
 wrote:
 Jianzhou Zhao jianz...@seas.upenn.edu writes:

 Hi,

 What is the average cost of the OCaml GC? I have a program that spends
 57% of its total execution time in 'mark_slice' and 21% in
 'sweep_slice', as reported by Callgrind, a profiling tool in Valgrind.
 57% and 21% are the 'self cost' --- the cost of the function itself
 ('Self Cost'), rather than the cost including all called functions
 ('Inclusive Cost'). I guess 'mark_slice' and 'sweep_slice' are
 functions from the OCaml GC. Are these numbers normal?

 Those numbers sound rather high to me.

 My program calls both OCaml and C, passing C data types back and
 forth. I also wonder whether I defined the interface in an 'inefficient'
 way that slows down the GC. Are there any rules of thumb to make the GC
 work more efficiently?

 You can tune some of the GC parameters to suit your use case.
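For instance, tuning can be done from within the program via the stock Gc module; a minimal sketch (the values are purely illustrative, not recommendations):

```ocaml
(* Sketch: enlarge the minor heap and make the major GC lazier.
   Sensible values depend entirely on the application. *)
let () =
  let c = Gc.get () in
  Gc.set { c with Gc.minor_heap_size = 4 * 1024 * 1024;  (* in words *)
                  Gc.space_overhead = 200 };
  Printf.printf "space_overhead is now %d\n" (Gc.get ()).Gc.space_overhead
```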

 Do you allocate custom types from C? In caml_alloc_custom(ops, size,
 used, max) the used and max do influence the GC how often to run.

 Yes. The code uses caml_alloc_custom to create a lot of small objects
 (less than 8 bytes) frequently. used and max are left at their
 defaults, 0 and 1. The manual says
   http://caml.inria.fr/pub/docs/manual-ocaml/manual032.html#toc140

 If your finalized blocks contain no pointers to out-of-heap resources,
 or if the previous discussion made little sense to you, just take used
 = 0 and max = 1. But if you later find that the finalization functions
 are not called “often enough”, consider increasing the used / max
 ratio.

 Does this mean the default used and max make the GC run finalization 'as
 rarely as possible'? That does not seem to be the case, given that the
 costs of 57% and 21% are so high.

 I think 0/1 gives you the least amount of GC runs.

 If you set them wrong you might trigger the GC too often.

 In which case could they be set 'wrong'? For example, if 'used' is not
 equal to the real amount of allocated data; or is there a range of
 'max' given a used?

 A used = 100 would be wrong here. Your 0/1 settings look fine to me.

 Do we have other methods to debug such problems? Is it possible to
 know when and where the GC runs, say, the number of times the GC runs
 after a particular user-defined function? If so, I was wondering
 whether we could see which function in my code behaves badly.

Only through the interface the Gc module exposes. You can make the GC
quite verbose.
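For instance, a sketch using only the stock Gc module to count collections across a call; `work` is a stand-in for the user-defined function under suspicion:

```ocaml
(* Sketch: report how many minor/major collections happen while
   running [work].  [work] stands for the function being investigated. *)
let count_gc_runs work =
  let before = Gc.stat () in
  let result = work () in
  let after = Gc.stat () in
  Printf.printf "minor collections: %d, major collections: %d\n"
    (after.Gc.minor_collections - before.Gc.minor_collections)
    (after.Gc.major_collections - before.Gc.major_collections);
  result

let () =
  (* allocate a boxed pair per iteration to trigger some minor GCs *)
  let sum = count_gc_runs (fun () ->
    let acc = ref 0 in
    for i = 1 to 1_000_000 do acc := !acc + fst (i, i) done;
    !acc)
  in
  Printf.printf "sum = %d\n" sum
```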

Regards,
Goswin

___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


[Caml-list] Option functions (or lack thereof) + operator for composition

2010-11-16 Thread Serge Le Huitouze
Hi,

While writing a sample program (with lablgtk2), I found a few things annoying
and thought I would ask here what you guys think.


1. Option type

It seems that there is no predefined function to test an 'a option for being
specifically None or Some _. This seems to be confirmed by the very
existence of:
http://ocaml-lib.sourceforge.net/doc/Option.html
which defines such functions (is_none and is_some).
I found it weird to be forced to use match expressions in my code for
doing that, e.g.:
*  let curSelectedRow = ref None in
*  let updateButtonsStatus () =
*  button_remove#misc#set_sensitive
*  (match !curSelectedRow with None -> false | _ -> true)
*  in
*  ...

I could add the OCaml library mentioned above, but I don't know how to do
that (or where to find it) and, since my code is supposed to go into some
other code, I'd prefer to avoid adding yet another dependency to it...


2. Operator for composition (and its precedence)

To get rid of many warnings, I wrapped some calls (the connect calls of
my widgets) into ignore (f x y) statements.
I've no particular grief in using ignore, but I find the parentheses
*really* annoying.

In Haskell, I would write ignore $ f x y, which I find much lighter weight.

I'm not familiar with operators and their precedence, but I wonder: is it
possible to do something similar with OCaml?


Thanks for reading.

Best regards.

--Serge



Re: [Caml-list] Option functions (or lack thereof) + operator for composition

2010-11-16 Thread Thomas Gazagnaire
Hi,

 I found it weird to be forced to use match expressions in my code for
 doing that, e.g.:
 *  let curSelectedRow = ref None in
 *  let updateButtonsStatus () =
 *  button_remove#misc#set_sensitive
 *  (match !curSelectedRow with None -> false | _ -> true)
 *  in
 *  ...
 

You are not forced to use match expressions, you can just define:
let is_none x = match x with None -> true | Some _ -> false
and is_some x = not (is_none x)
and then use these functions in your code ...
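For example (a self-contained sketch; the lablgtk call from the original snippet is left as a comment since it needs a real widget):

```ocaml
let is_none x = match x with None -> true | Some _ -> false
let is_some x = not (is_none x)

let () =
  let cur_selected_row = ref None in
  assert (is_none !cur_selected_row);
  cur_selected_row := Some 3;
  assert (is_some !cur_selected_row)
  (* in the original snippet this becomes:
     button_remove#misc#set_sensitive (is_some !curSelectedRow) *)
```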

 In Haskell, I would write ignore $ f x y, which I find much lighter
weight.

Just define:
let ($) f x = f x
and that should do the trick

Thomas



Re: [Caml-list] Option functions (or lack thereof) + operator for composition

2010-11-16 Thread Gabriel Kerneis
On Tue, Nov 16, 2010 at 12:43:30PM +0100, Thomas Gazagnaire wrote:
 You are not forced to use match expressions, you can just define:
  let is_none x = match x with None -> true | Some _ -> false
 and is_some x = not (is_none x)
 and then use these functions in your code ...

Or even simpler:
  let is_none = (=) None
  let is_some = (!=) None
Note that you can use this directly, for instance:
  List.filter ((!=) None) l
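A small sketch of that last point; note that (!=) is physical inequality, which happens to work here because None is an immediate constructor, while ((<>) None) is the structural equivalent:

```ocaml
let () =
  let l = [Some 1; None; Some 2; None] in
  (* keep only the Some _ elements *)
  let somes = List.filter ((!=) None) l in
  assert (somes = [Some 1; Some 2]);
  (* structural comparison gives the same result *)
  assert (List.filter ((<>) None) l = somes)
```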

Best,
-- 
Gabriel



[Caml-list] Re: SMP multithreading

2010-11-16 Thread Sylvain Le Gall
Hi,

On 15-11-2010, Wolfgang Draxinger wdraxinger.maill...@draxit.de wrote:
 Hi,

 I've just read
 http://caml.inria.fr/pub/ml-archives/caml-list/2002/11/64c14acb90cb14bedb2cacb73338fb15.en.html
 in particular this paragraph:
| What about hyperthreading?  Well, I believe it's the last convulsive
| movement of SMP's corpse :-)  We'll see how it goes market-wise.  At
| any rate, the speedups announced for hyperthreading in the Pentium 4
| are below a factor of 1.5; probably not enough to offset the overhead
| of making the OCaml runtime system thread-safe.

 This reads just like the 640k ought to be enough for everyone. Multicore
 systems are the standard today. Even the cheapest consumer machines
 come with at least two cores. One can easily get 6-core machines today.

 Still thinking SMP was a niche and was dying?


Hyperthreading was never remarkable in terms of performance and is
probably not pure SMP (emulated SMP, maybe?).

 So, what're the developments regarding SMP multithreading OCaml?


There are various developments regarding this subject (most recent
first):
- Plasma (MapReduce in OCaml)
  http://plasma.camlcity.org/plasma/index.html
- OC4MC (OCaml for MultiCore)
  http://www.algo-prog.info/ocmc/web/
- ocamlp3l
  http://camlp3l.inria.fr/eng.htm
- jocaml 
  http://jocaml.inria.fr/
- ocamlmpi
  http://forge.ocamlcore.org/projects/ocamlmpi/

All these projects try to tackle the challenge of SMP from different
points of view. Maybe you'll find your answer in one of them.

Regards,
Sylvain Le Gall



Re: [Caml-list] Option functions (or lack thereof) + operator for composition

2010-11-16 Thread Jacques Garrigue
On 2010/11/16, at 20:27, Serge Le Huitouze wrote:
 1. Option type
 
 It seems that there is no predefined function to test an 'a option for being
 specifically None or Some _.

In OCaml you can compare values of any type.
So you can just write
  (!curSelectedRow <> None)
which explains why there is no such function in the standard library.

 2. Operator for composition (and its precedence)
 
 To get rid of many warnings, I wrapped some calls (the connect calls of
 my widgets) into ignore (f x y) statements.
 I've no particular grief in using ignore, but I find the parentheses
 *really* annoying.
 
 In Haskell, I would write ignore $ f x y, which I find much lighter weight.
 
 I'm not familiar with operators and their precedence, but I wonder: is it
 possible to do something similar with OCaml?

You can, but unfortunately $ does not have the right associativity.

# let ($) f x = f x;;
val ( $ ) : ('a -> 'b) -> 'a -> 'b = <fun>
# ignore $ 1+1;;
- : unit = ()
# succ $ succ 3;;
- : int = 5
# succ $ succ $ succ 3;;
Error: This expression has type int -> int
   but an expression was expected of type int

As you can see, it works fine for one argument, but doesn't work if you nest 
them.
(Actually, this is a question of point of view, since you can use it
in place of parentheses to sequence multiple arguments)

If you want the compositionality, you can use something starting with @:

# let (@@) f x = f x;;
val ( @@ ) : ('a -> 'b) -> 'a -> 'b = <fun>
# succ @@ succ @@ succ 3;;
- : int = 6
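Applied to the original problem, this lets the ignored calls drop their parentheses (a sketch; f here is a hypothetical stand-in for a connect-style call):

```ocaml
(* Right-associative application operator, as defined above. *)
let ( @@ ) f x = f x

(* hypothetical stand-in for a call whose result we want to ignore *)
let f x y = x + y

let () =
  ignore @@ f 1 2;                     (* instead of: ignore (f 1 2) *)
  assert (succ @@ succ @@ succ 3 = 6)  (* nesting works, unlike ($) *)
```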

Note however that there is a simpler way to circumvent the problem:
use the -w s option on the command line, disabling the statement warning.
All my lablgtk code uses this flag :-)

Jacques Garrigue


Re: [Caml-list] Option functions (or lack thereof) + operator for composition

2010-11-16 Thread Serge Le Huitouze
 To get rid of many warnings, I wrapped some calls (the connect calls of
 my widgets) into ignore (f x y) statements.

 You can, but unfortunately $ does not have the right associativity.

 # let ($) f x = f x;;
 val ( $ ) : ('a -> 'b) -> 'a -> 'b = <fun>
 # ignore $ 1+1;;
 - : unit = ()
 # succ $ succ 3;;
 - : int = 5
 # succ $ succ $ succ 3;;

 If you want the compositionality, you can use something starting with @:

 # let (@@) f x = f x;;
 val ( @@ ) : ('a -> 'b) -> 'a -> 'b = <fun>
 # succ @@ succ @@ succ 3;;
 - : int = 6


Wow, tricky!
Not as tricky as Perl's naming scheme for variables, but still tricky! ;-)
So I have to learn the precedence and associativity table in section 6.7 of
OCaml's reference manual!!!


 Note however that there is a simpler way to circumvent the problem:
 use the -w s option on the command line, disabling the statement warning.
 All my lablgtk code uses this flag :-)

Warnings should be gotten rid of, so I don't like the idea of making
them silent.
But restricting silence to the statement warning is probably OK.
And I'll take your words for it, since, obviously, you have more experience
with lablgtk than myself! ;-)

But, on the other hand, maybe you can harmlessly use -w s, whereas it may
bite me, given my poor understanding of this stuff...


Thanks.

--Serge



Re: [Caml-list] Option functions (or lack thereof) + operator for composition

2010-11-16 Thread dmitry grebeniuk
Hello.

 So I have to learn the precedence and associativity table in section 6.7 of
 OCaml's reference manual!!!

  I usually redefine the (  ) operator: let (  ) f x = f x
  It's fine as long as you don't use JoCaml, which defines that symbol as a
keyword.
  Also take a look at the other infix operators I use; maybe you'll find
them useful too: http://bitbucket.org/gds/ops/src/tip/ops.ml (the comments
are in Russian and in OCaml; I'll translate the Russian text to English on
first request, it's not hard for me, but the OCaml code in the comments
seems to describe the usage well enough).



Re: [Caml-list] Option functions (or lack thereof) + operator for composition

2010-11-16 Thread bluestorm
On Tue, Nov 16, 2010 at 3:26 PM, Michael Ekstrand mich...@elehack.net wrote:

 Batteries provides operators for things like this.  It defines the '**'
 operator for function application; it's an odd name, but it has the
 right associativity.  As Dmitry mentioned, some override ().  Batteries
 also provides composition operators |- and -|, and a pipeline operator
 |> (opposite of **).  With that operator, you can write:

f x y |> ignore

 thereby putting the emphasis on f x y and relegating ignore to a
 cleanup at the end.

 - Michael


(<|) as the inverse of (|>) is also available. It doesn't have the right
associativity, but you can easily use (f -| g -| h <| x) instead of (f ** g
** h ** x).

Though I find the application-as-pipeline style quite readable in some
cases, I think that in general it is more often superfluous than not.
Besides, as mentioned recently on this list, overuse of the function
composition operators (|-) and (-|) also calls for trouble with the
value restriction.
All in all, I think it's reasonable to stay conservative and not advertise
funky binary operators too loudly.

That said, domain-specific binary operators are certainly useful for
readability in some contexts --- that's what an infix operator is anyway:
an unreadable-by-design symbol that only gets meaning through domain-specific
conventions. Local opens, available as standard since OCaml 3.12, allow us
to neatly encapsulate such domain-specific notations into OCaml modules.
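A sketch of that encapsulation (the Vec module and its ( +% ) operator are hypothetical; the local-open syntax is OCaml >= 3.12):

```ocaml
(* A domain-specific operator confined to its module... *)
module Vec = struct
  type t = { x : float; y : float }
  let ( +% ) a b = { x = a.x +. b.x; y = a.y +. b.y }
end

(* ...and brought into scope only where the notation helps. *)
let () =
  let open Vec in
  let v = { x = 1.0; y = 2.0 } +% { x = 3.0; y = 4.0 } in
  assert (v.x = 4.0 && v.y = 6.0)
```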


Re: [Caml-list] Option functions (or lack thereof) + operator for composition

2010-11-16 Thread bluestorm
On Tue, Nov 16, 2010 at 12:27 PM, Serge Le Huitouze 
serge.lehuito...@gmail.com wrote:

 It seems that there is no predefined function to test an 'a option for
 being
 specifically None or Some _. This seems to be confirmed by the very
 existence of:
 http://ocaml-lib.sourceforge.net/doc/Option.html
 which defines such functions (is_none and is_some).
 I found it weird to be forced to use match expressions in my code for
 doing that, e.g.:
 *  let curSelectedRow = ref None in
 *  let updateButtonsStatus () =
 *  button_remove#misc#set_sensitive
 *  (match !curSelectedRow with None -> false | _ -> true)
 *  in
 *  ...


Though useless in this case (just use ((<>) None)), there is a very nice
syntax extension proposal by Richard Jones to transform any pattern into a
boolean predicate: (matches p) would be equivalent to a function that
returns true if the input matches the pattern. I have implemented it in
camlp4 (the code may be slightly bitrotten) in case you're interested:
  http://bluestorm.info/camlp4/pa_matches.ml.html



 I'm not familiar with operators and their precedence, but I wonder: is it
 possible to do something similar with OCaml?


In OCaml, the associativity/precedence of an operator is determined by its
first symbols. For example (++$*) has exactly the precedence of (+). You can
find all precedence classes and their prefixes in the OCaml manual:
  http://caml.inria.fr/pub/docs/manual-ocaml/expr.ht...@manual.kwd33
Though this is less flexible than other languages that let you choose
precedence and associativity on a case-by-case basis, it gives a nice
homogeneity to binary operators: you don't need to look at the operator's
definition site to have a (vague, unless you know the table by heart) idea
of its syntactic properties.
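A short sketch of that rule; ( +! ) and ( *! ) are made-up operators that inherit the precedence of (+) and ( * ) respectively:

```ocaml
let ( +! ) a b = a + b  (* parses like (+) *)
let ( *! ) a b = a * b  (* parses like ( * ) *)

let () =
  (* ( * ) binds tighter than ( +! ), so this is 2 +! (3 * 4) *)
  assert (2 +! 3 * 4 = 14);
  (* ( *! ) binds tighter than (+), so this is (2 *! 3) + 4 *)
  assert (2 *! 3 + 4 = 10)
```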


Re: [Caml-list] SMP multithreading

2010-11-16 Thread Gerd Stolpmann
On Monday, Nov 15, 2010, at 22:46 -0800, Edgar Friendly wrote:
 On 11/15/2010 09:27 AM, Wolfgang Draxinger wrote:
  Hi,
 
  I've just read
  http://caml.inria.fr/pub/ml-archives/caml-list/2002/11/64c14acb90cb14bedb2cacb73338fb15.en.html
  in particular this paragraph:
  | What about hyperthreading?  Well, I believe it's the last convulsive
  | movement of SMP's corpse :-)  We'll see how it goes market-wise.  At
  | any rate, the speedups announced for hyperthreading in the Pentium 4
  | are below a factor of 1.5; probably not enough to offset the overhead
  | of making the OCaml runtime system thread-safe.
 
  This reads just like the 640k ought to be enough for everyone. Multicore
  systems are the standard today. Even the cheapest consumer machines
  come with at least two cores. One can easily get 6-core machines today.
 
  Still thinking SMP was a niche and was dying?
 
  So, what're the developments regarding SMP multithreading OCaml?
 
 
  Cheers
 
  Wolfgang
 
 At the risk of feeding a (possibly unintentional) troll, I'd like to 
 share some possibly new thoughts on this ever-living topic.
 
 It looks like high-performance computing of the near future will be 
 built out of many machines (message passing), each with many cores 
 (SMP).  One could use message passing for all communication in such a 
 system, but a hybrid approach might be best for this architecture, with 
 use of shared memory within each box and message passing between.  Of 
 course the best choice depends strongly on the particular task.
 
 In the long run, it'll likely be a combination of a few large, powerful 
 cores (Intel-CPU style w/ the capability to run a single thread as fast 
 as possible) with many many smaller compute engines (GPGPUs or the like, 
 optimized for power and area, closely coupled with memory) that provides 
 the highest performance density.
 
 The question of how to program such an architecture seems as if it's 
 being answered without the functional community's input. What can we 
 contribute?

Yes, that's generally the right question. Current hardware is a kind of
experiment - vendors have only taken the multicore path because it is
right now the easiest way of improving the performance potential,
although it is questionable whether (non-server) applications can really
benefit from it (excluding here server apps because for these
parallelization is relatively easy to get). Future hardware will
probably be even more different - however, it is still unclear which
design paths will be taken. Could be manycores (many CPUs with
non-uniform RAM), could be specialized compute units. Maybe we'll see
again a separation of consumer and datacenter markets - the former
optimizing for numeric simulation applications (i.e. games), the latter
for high-throughput data paths and parallel CPU power. The problem here
is that this is all speculation.

There are some things we can do to improve the situation (and some ideas
are not realistic):

  * A probably not-so-difficult improvement would be better message
passing between independent but local processes. I've started an
experiment for such a mechanism

(http://projects.camlcity.org/projects/dl/ocamlnet-3.0.3/doc/html-main/Netcamlbox.html),
 which tries to exploit that GC-managed memory has a well-known structure. With 
more help from the GC this could be made even better (safer, fewer corner 
cases).
  * We need more frameworks for parallel programming. I'm currently
developing Plasma, a Map/Reduce framework. Using a framework has
the big advantage that the whole program is structured so it
profits from parallelization, and that it is possible to train
developers for it that have no idea about parallelization. There
are probably more algorithm schemes where this is possible.
  * I have a lot of doubts whether FP languages will ever run well on SMP
with a larger number of cores. The problem is the relatively
high memory allocation rate - the GC has to work a lot harder
than in imperative languages. The OC4MC project uses
thread-local minor heaps because of this. Probably this is not
enough, and one even needs thread-local major heaps (plus a
third generation for values accessed by several threads). All in
all you could get the same effect by instantiating the ocaml
runtime several times (if this were possible), let each runtime
run in its own thread, and provide some extra functionality for
passing values between threads and for sharing values. This
would not be exactly the SMP model, but would allow a number of
parallelization techniques, and is probably future-proof as it
encourages message passing over sharing. This is certainly worth
experimentation.
  * One can also tackle the problem from the multi-processing side:
Provide better mechanisms for message passing (see above) and

Re: [Caml-list] OCamlJit 2.0

2010-11-16 Thread bluestorm
To those of you who are lazy but still curious, I just read the report, and
here are the answers to the questions I had:

1. Is that project related to Basile Starynkevitch's venerable OCamlJIT ?

Yes, OcamlJIT was apparently a major inspiration for this work. The overall
design is similar, and in particular the objective of absolute compatibility
with the existing bytecode (even when it may hurt performance) was
retained. Unfortunately, due to bitrotting and different architectures
(OcamlJIT x86 vs. OcamlJIT2 x86_64), there is no direct performance
comparison, though the report seems to indicate similar or better
performance. Several points were also improved (better (but still
suboptimal) register usage, clever hacks to speed up the mapping between
bytecode and native code addresses...).

2. Does it use LLVM ?

No, it doesn't, but it's justified. Apparently, an early prototype showed
that LLVM compilation/generation overhead was too high. A (debatable) design
goal of OcamlJIT2.0 is to be absolutely faster than ocamlrun, even on *very*
short-running programs (`ocamlc -help`). Here is a direct quote from the
report :

 In a simple OCamlJit2 prototype based on LLVM, we have measured significant
 compilation overhead; short running programs (like ocamlc applied to a small
 to medium sized *.ml file) were three to four times slower than running the
 same program with the byte-code interpreter. Similar results were observed
 by other projects using LLVM for Just-In-Time compilation. For example, the
 Mono project reported impressive speedups for long running, computationally
 intensive applications, but at the cost of increased compilation time and
 memory usage.


OCamlJIT2.0 uses its own macros to generate x86_64 assembly directly
(apparently using a dedicated library wasn't worth it). A drawback of this
strategy is its inherent non-portability.


3. Does it perform well ?

As explained in the original post, the speedups against bytecode are around
2-6x. It's quite honourable, and the results are very consistent: all
benchmarked programs running for more than 1 second have speedups between
2.59 and 5.34.

The authors seem to think other optimizations are possible, and expect
significant performance improvements in the future. The goal of being on
par with the ocaml native compiler is (prudently) mentioned. On the
benchmark, ocamlopt was itself 2 to 6 times faster than OcamlJIT2.0.


4. Various remarks:

Various clever hacks are described in the report, which bring noticeable
performance improvements but probably make the implementation more complex.
I was also interested in the following remark --- maybe a bit provocative:

 Both JVM and CLR depend on profile guided optimizations for good performance,
 because their byte-code is slower than the OCaml byte-code. This is mostly
 because of the additional overhead that comes with exception handling,
 multi-threading, class loading and object synchronization. But it is also
 due to the fact that both the JVM byte-code [27] and the Common Intermediate
 Language (CIL) [10] instruction sets are not as optimized as the OCaml
 byte-code.




On Tue, Nov 16, 2010 at 3:52 PM, Benedikt Meurer 
benedikt.meu...@googlemail.com wrote:

 Hello everybody,

 OCamlJit 2.0 is a new Just-In-Time engine for Objective Caml 3.12.0 on
 desktop processors (x86/x86-64). It translates the OCaml byte-code used by
 the interpreter (ocamlrun and ocaml) to x86/x86-64 native code on-demand and
 runs the generated native code instead of interpreting the byte-code. It is
 designed to run with minimal compilation overhead (translating only what is
 being executed, avoiding costly code generation and optimization
 techniques), while being 100% compatible with the byte-code runtime
 (including serialization and hashing of closures, etc.).

 OCamlJit 2.0 was specifically designed for desktop processors and is not
 really portable to anything else in its current shape, because the target
 audience are people using the interactive top-level and the byte-code
 interpreter for rapid prototyping/development (which is unlikely to happen
 on anything else but x86/x86-64). The implementation currently requires a
 system that adheres to the SysV ABI, which includes Linux, BSD, OS X, but
 excludes Win32/Win64 (patches/ideas are welcome). It was tested on Linux/x86
 (Debian), Linux/amd64 (CentOS) and Mac OS X 10.6 (64bit). The x86
 implementation requires SSE2 capable processors (otherwise it falls back to
 the byte-code interpreter), so it won't speedup your OCaml programs running
 on 486 CPUs. :-)

 OCamlJit 2.0 runs most benchmarks at 2-6 times faster than the byte-code
 interpreter. The interactive top-level benefits twice when used with the JIT
 engine: (a) the compiler stages are JIT compiled and (b) the generated
 byte-code is JIT compiled. A tech report describing a slightly earlier
 prototype and including performance measures of OCamlJit 2.0 on Mac OS X
 (64bit) is available at:

  

Re: [Caml-list] OCamlJit 2.0

2010-11-16 Thread Benedikt Meurer

On Nov 16, 2010, at 18:07 , bluestorm wrote:

 To those of you who are lazy but still curious, I just read the report, and 
 here are the answers to the question I had:

Thanks for posting these points, should have done this in my original post...

 1. Is that project related to Basile Starynkevitch's venerable OCamlJIT ?
 
 Yes, OcamlJIT was apparently a major inspiration for this work. The overall 
 design is similar, and in particular the objective of absolute compatibility 
 with the existing bytecode (even when it may hurt performance) was
 retained. Unfortunately, due to bitrotting and different architectures
 (OcamlJIT x86 vs. OcamlJIT2 x86_64), there is no direct performance
 comparison, though the report seems to indicate similar or better
 performance. Several points were also improved (better (but still
 suboptimal) register usage, clever hacks to speed up the mapping between
 bytecode and native code addresses...).

With the x86 port being close to complete, a direct comparison with OcamlJIT
will be possible (dunno if OcamlJIT will work with 3.12, though). I'll do
appropriate comparisons once everything is in place.

 2. Does it use LLVM ?
 
 No, it doesn't, but it's justified. Apparently, an early prototype showed 
 that LLVM compilation/generation overhead was too high. A (debatable) design
 goal of OcamlJIT2.0 is to be absolutely faster than ocamlrun, even on *very* 
 short-running programs (`ocamlc -help`).

This is indeed debatable, at least for ocamlc -help. But this was not the
main concern with LLVM. LLVM overhead is acceptable for long-running
computations, but everything else is slowed down noticeably. It should also
be noted that my LLVM prototype was rather quick and dirty, so it may indeed
be possible to get an LLVM-based JIT which is on par with the byte-code
interpreter for common applications. But why would one want to do this?
Long-running computations can be sped up very well using ocamlopt (no need
to perform them in the interactive top-level).

 OCamlJIT2.0 uses its own macro to generate x86_64 assembly directly 
 (apparently using a dedicated library wasn't worth it). A drawback of this 
 strategy is its inherent non-portability.

It used AsmJit before, which is also limited to x86/x86-64; the non-portability 
is actually a design decision (as explained in my original post, portability to 
non-desktop machines was not a goal for the current implementation).

HTH,
Benedikt Meurer


Re: [Caml-list] Option functions (or lack thereof) + operator for composition

2010-11-16 Thread Martin Jambon
On 11/16/10 03:51, Gabriel Kerneis wrote:
 On Tue, Nov 16, 2010 at 12:43:30PM +0100, Thomas Gazagnaire wrote:
 You are not forced to use match expressions, you can just define:
 let is_none x = match x with None -> true | Some _ -> false
 and is_some x = not (is_none x)
 and then use these functions in your code ...
 
 Or even simpler:
   let is_none = (=) None
   let is_some = (!=) None
 Note that you can use this directly, for instance:
   List.filter ((!=) None) l

I would like to add that I also follow this general approach of
defining such trivial functions where I need them.  Being trivial
does not mean that they are going to be used everywhere.

It makes the code easier to *read*.

My 2 cents.


Martin



Re: [Caml-list] SMP multithreading

2010-11-16 Thread Norman Hardy

On 2010 Nov 15, at 22:46 , Edgar Friendly wrote:

 It looks like high-performance computing of the near future will be built out 
 of many machines (message passing), each with many cores (SMP).  One could 
 use message passing for all communication in such a system, but a hybrid 
 approach might be best for this architecture, with use of shared memory 
 within each box and message passing between.  Of course the best choice 
 depends strongly on the particular task.
 
 In the long run, it'll likely be a combination of a few large, powerful cores 
 (Intel-CPU style w/ the capability to run a single thread as fast as 
 possible) with many many smaller compute engines (GPGPUs or the like, 
 optimized for power and area, closely coupled with memory) that provides the 
 highest performance density.

OCaml code should be able to share immutable OCaml data with other processes 
just as it shares libraries.
See http://cap-lore.com/Software/pch.html .
Some of the ideas there might be improved with hardware support.

Admission: If I had read all of the interesting pointers given on this thread I 
would never finish sending this e-mail.


Re: [Caml-list] SMP multithreading

2010-11-16 Thread Eray Ozkural
On Tue, Nov 16, 2010 at 7:04 PM, Gerd Stolpmann i...@gerd-stolpmann.de wrote:

 Am Montag, den 15.11.2010, 22:46 -0800 schrieb Edgar Friendly:
  * As somebody mentioned implicit parallelization: Don't expect
anything from this. Even if a good compiler finds ways to
parallelize 20% of the code (which would be a lot), the runtime
effect would be marginal. 80% of the code is run at normal speed
(hopefully) and dominates the runtime behavior. The point is
that such compiler-driven code improvements are only local
optimizations. For getting good parallelization results you need
to restructure the design of the program - well, maybe
compiler2.0 can do this at some time, but this is not in sight.


I think you are underestimating parallelizing compilers.


-- 
Eray Ozkural, PhD candidate.  Comp. Sci. Dept., Bilkent University, Ankara
http://groups.yahoo.com/group/ai-philosophy
http://myspace.com/arizanesil http://myspace.com/malfunct


Re: [Caml-list] SMP multithreading

2010-11-16 Thread Gerd Stolpmann
Am Dienstag, den 16.11.2010, 22:35 +0200 schrieb Eray Ozkural:
 On Tue, Nov 16, 2010 at 7:04 PM, Gerd Stolpmann i...@gerd-stolpmann.de wrote:
  Am Montag, den 15.11.2010, 22:46 -0800 schrieb Edgar Friendly:
   * As somebody mentioned implicit parallelization: Don't expect
     anything from this. Even if a good compiler finds ways to
     parallelize 20% of the code (which would be a lot), the runtime
     effect would be marginal. 80% of the code is run at normal speed
     (hopefully) and dominates the runtime behavior. The point is
     that such compiler-driven code improvements are only local
     optimizations. For getting good parallelization results you need
     to restructure the design of the program - well, maybe
     compiler2.0 can do this at some time, but this is not in sight.

 I think you are underestimating parallelizing compilers.

I was mainly citing Amdahl's law, and did not want to criticize any effort
in this area. My point is more about the usefulness for the majority of
problems. How useful is the best parallelizing compiler if only a small part
of the program _can_ actually benefit from it? Think about it. If you are not
working in an area where many subroutines can be sped up, you will consider
this way of parallelizing a waste of time. And this is still true for
the majority of problems. Also, for many problems that can be tackled,
the scalability is very limited.
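The 20%/80% figure is Amdahl's law in action: with a parallelizable fraction p of the runtime and n cores, the overall speedup is bounded by

```latex
S(n) = \frac{1}{(1 - p) + p/n},
\qquad
\lim_{n \to \infty} S(n) = \frac{1}{1 - p}
```

so even a perfectly parallelized 20% of the runtime (p = 0.2) caps the total speedup at 1/0.8 = 1.25x, no matter how many cores are available.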

Gerd 
-- 

Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany 
g...@gerd-stolpmann.de  http://www.gerd-stolpmann.de
Phone: +49-6151-153855  Fax: +49-6151-997714




Re: [Caml-list] SMP multithreading

2010-11-16 Thread Eray Ozkural
On Wed, Nov 17, 2010 at 12:13 AM, Gerd Stolpmann i...@gerd-stolpmann.de wrote:

 Am Dienstag, den 16.11.2010, 22:35 +0200 schrieb Eray Ozkural:
  On Tue, Nov 16, 2010 at 7:04 PM, Gerd Stolpmann i...@gerd-stolpmann.de wrote:
   Am Montag, den 15.11.2010, 22:46 -0800 schrieb Edgar Friendly:
    * As somebody mentioned implicit parallelization: Don't expect
      anything from this. Even if a good compiler finds ways to
      parallelize 20% of the code (which would be a lot), the runtime
      effect would be marginal. 80% of the code is run at normal speed
      (hopefully) and dominates the runtime behavior. The point is
      that such compiler-driven code improvements are only local
      optimizations. For getting good parallelization results you need
      to restructure the design of the program - well, maybe
      compiler2.0 can do this at some time, but this is not in sight.

  I think you are underestimating parallelizing compilers.

 I was more citing Amdahl's law, and did not want to criticize any effort
 in this area. It's more the usefulness for the majority of problems. How
 useful is the best parallelizing compiler if only a small part of the
 program _can_ actually benefit from it? Think about it. If you are not
 working in an area where many subroutines can be sped up, you consider
 this way of parallelizing as a waste of time. And this is still true for
 the majority of problems. Also, for many problems that can be tackled,
 the scalability is very limited.



What makes you think only 20% of the code can be parallelized? I've worked
in such a compiler project, and there were way too many opportunities for
parallelization in ordinary C code, let alone a functional language;
implicit parallelism would work wonders there. Of course you may think
whatever you were thinking, but I know that a high degree of parallelism can
be achieved through a functional language. I can't tell you much more
though. If you think that a very small portion of the code can be
parallelized you probably do not appreciate the kinds of static and dynamic
analysis those compilers perform. Of course if you are thinking of applying
it to some office or e-mail application, this might not be the case, but
automatic parallelization strategies would work best when you apply them to
a computationally intensive program.

The really limiting factor for current functional languages would be their
reliance on inherently sequential primitives like list processing, which may
in some cases limit the compiler to only pipeline parallelism (something
non-trivial to do by hand actually). Instead they would have to get their
basic forms from high-level parallel PLs (which might mean rewriting a lot
of things). The programs could look like something out of a category theory
textbook then. But I think even without modification you could get a lot of
speedup from ocaml code by applying state-of-the-art automatic
parallelization techniques. By the way, what matters is how much of the
serial work is parallelized, not how many lines of code. Even
if the parallelism is not obvious, the analysis can find out which
iterations are parallelizable, and which variables have which kinds of
dependencies, memory dependencies, and so on. There is a public
perception that automatic parallelization does not work, which is wrong;
it is true that the popular compiler vendors do not understand much in
this regard, and all I've seen in popular products are lame attempts to
generate vector instructions.

That is not to say that implicit parallelization of current code can
replace parallel algorithm design (which is AI-complete). Rather, I think,
implicit parallelization is one of the things that will help parallel
computing people: by having them work at a more abstract level, exposing
more concurrency through functional forms, yet avoiding writing low-level
comms code. High-level explicit parallelism is also quite favorable, as
there may be many situations where, say, dynamic load-balancing approaches
are suitable. The approach of HPF might be relevant here, with the
programmer making annotations to guide the distribution of data structures,
and the rest inferred from code.

So whoever says that isn't possible probably hasn't read up much in the
computer architecture community. Probably even expert PL researchers may be
misled here, as they have made the quoted remark or similar remarks about
multi-core/SMP architectures. It was known for a very long time (20+ years)
that the clock speeds would hit a wall and then we'd have to expend more
area. This is true regardless of the underlying architecture/technology
actually. Mother nature is parallel. It is the sequence that is an
abstraction. And I wonder what is more natural than 

Re: [Caml-list] SMP multithreading

2010-11-16 Thread Wolfgang Draxinger
On Wed, 17 Nov 2010 01:04:54 +0200
Eray Ozkural examach...@gmail.com wrote:

 [readworthy text]

I'd like to point out how the big competitor to OCaml deals with it.
The GHC Haskell system has had SMP parallelization built in for some time,
and it does it quite well.


Wolfgang



Re: [Caml-list] SMP multithreading

2010-11-16 Thread Eray Ozkural
On Wed, Nov 17, 2010 at 1:52 AM, Wolfgang Draxinger 
wdraxinger.maill...@draxit.de wrote:

 On Wed, 17 Nov 2010 01:04:54 +0200
 Eray Ozkural examach...@gmail.com wrote:

  [readworthy text]

 I'd like to point out how the big competitor to OCaml deals with it.
 The GHC Haskell system has had SMP parallelization built in for some time,
 and it does it quite well.


I think I tested the parallel features just once in the distant past;
something like that would be so useful for ocaml :) Explicit threading that
is suited to functional programming, with a syntax independent of the actual
thread implementation. The par combinator looks like fun to use. Wishlist
item, definitely :)
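For illustration, a combinator in the spirit of GHC's par could be sketched in OCaml on top of the stock threads library. The name `par` and the two-thunk interface are hypothetical, and with the current runtime the two threads do not execute OCaml code simultaneously, so this only shows what the interface might look like (compile with the threads library linked in):

```ocaml
(* Hypothetical par combinator: evaluate two thunks "in parallel"
   and return both results.  One thunk is handed to a fresh thread
   while the other runs in the caller; Thread.join synchronizes. *)
let par (f : unit -> 'a) (g : unit -> 'b) : 'a * 'b =
  let result = ref None in
  let t = Thread.create (fun () -> result := Some (f ())) () in
  let y = g () in
  Thread.join t;
  match !result with
  | Some x -> (x, y)
  | None -> assert false   (* join guarantees the write happened *)

let () =
  let (a, b) = par (fun () -> 1 + 1) (fun () -> 2 * 2) in
  assert ((a, b) = (2, 4))
```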

Cheers,

-- 
Eray Ozkural, PhD candidate.  Comp. Sci. Dept., Bilkent University, Ankara
http://groups.yahoo.com/group/ai-philosophy
http://myspace.com/arizanesil http://myspace.com/malfunct


RE: [Caml-list] SMP multithreading

2010-11-16 Thread Jon Harrop
Wolfgang wrote:
 I'd like to point out how the big competitor to OCaml deals with it.
 The GHC Haskell system has had SMP parallelization built in for some time,
 and it does it quite well.

I beg to differ. Upon trying to reproduce many of the Haskell community's
results, I found that even their own parallel Haskell programs often exhibit
huge slowdowns. This is because Haskell's unpredictable performance leads to
unpredictable granularity and, consequently, more time can be spent
administering the tiny parallel computations than is gained by doing so.

The results I found here are typical:

 
http://flyingfrogblog.blogspot.com/2010/01/naive-parallelism-with-hlvm.html

Note that the absolute performance peaks at an unpredictable number of cores
only in the case of Haskell. This is because the GC does not scale beyond
about 4 cores for any Haskell programs doing significant amounts of
allocation, which is basically all Haskell programs because allocations are
everywhere in Haskell.

Ultimately, running on all cores attains no speedup at all with Haskell in
that case. This was branded the "last core slowdown", but the slowdown
clearly started well before all 8 cores. There was a significant development
towards improving this situation, but it won't fix the granularity problem:

  http://hackage.haskell.org/trac/ghc/blog/new-gc-preview

The paper "Regular, shape-polymorphic, parallel arrays in Haskell" cites
2.5x speedups when existing techniques were not only already getting 7x
speedups but better absolute performance as well. Cache complexity is the
problem, as I explained here:

  http://flyingfrogblog.blogspot.com/2010/06/regular-shape-polymorphic-parallel.html

Probably the best solution for multicore programming is Cilk. This technique
has already been adopted both in Intel's TBB and Microsoft's .NET 4 but,
AFAIK, the only functional language with access to it is F#. There are some
great papers on multicore-friendly cache oblivious algorithms written in
Cilk:

  http://www.fftw.org/~athena/papers/tocs08.pdf

Note, in particular, that Cilk is not only much faster but also much easier
to use than explicit message passing.
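The granularity point can be made concrete: Cilk-style code always falls back to a sequential version below some cutoff, so tasks never become smaller than the cost of spawning them. Here is a sequential OCaml sketch of the shape of such a divide-and-conquer (the cutoff value and names are illustrative, and the two recursive calls are evaluated sequentially where Cilk would spawn them):

```ocaml
(* Shape of a Cilk-style divide-and-conquer sum over an array:
   below the cutoff we run a plain sequential loop, so spawn
   overhead can never dominate the useful work of a task. *)
let cutoff = 1024

let rec sum a lo hi =
  if hi - lo <= cutoff then begin
    (* sequential base case *)
    let s = ref 0 in
    for i = lo to hi - 1 do s := !s + a.(i) done;
    !s
  end else begin
    let mid = (lo + hi) / 2 in
    (* in Cilk, these two calls would be spawned in parallel *)
    let left = sum a lo mid in
    let right = sum a mid hi in
    left + right
  end

let () =
  let a = Array.init 10_000 (fun i -> i) in
  assert (sum a 0 (Array.length a) = 10_000 * 9_999 / 2)
```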

To do something like this, threads need to be able to run in parallel and
mutate the same shared heap. Although that is objectively easy (I did it in
HLVM), OCaml's reliance upon very high allocation rates, efficient
collection of young garbage and a ridiculous density of pointers in the heap
make it a *lot* harder.

Cheers,
Jon.




RE: [Caml-list] SMP multithreading

2010-11-16 Thread Jon Harrop
Granularity and cache complexity are the reasons why not. If you find
anything and everything that can be done in parallel and parallelize it then
you generally obtain only slowdowns. An essential trick is to exploit
locality via mutation but, of course, purely functional programming sucks at
that by design not least because it is striving to abstract that concept
away.

 

I share your dream but I doubt it will ever be realized.

 

Cheers,

Jon.

 

From: caml-list-boun...@yquem.inria.fr
[mailto:caml-list-boun...@yquem.inria.fr] On Behalf Of Eray Ozkural
Sent: 16 November 2010 23:05
To: Gerd Stolpmann
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] SMP multithreading

[Caml-list] Lightweight way to extend existing module. Almost there?

2010-11-16 Thread Till Varoquaux
With the improvements to the module system in ocaml 3.12, extending a
signature just got much, much lighter. Suppose I want the signature of
a module which in every respect is like StdLabels except that the
String module has an extra is_prefix_of; I can write:

  module String : sig
    include module type of StdLabels.String
    val is_prefix_of : string -> string -> bool
  end

  include module type of StdLabels with module String := String

I know of no way to extend the implementation in a similarly concise
way (for which the length of the solution does not depend on the
number of top level items in StdLabels). Have I overlooked anything
obvious?

Till

P.S.: FYI existing solution for the implementation:

  module String = struct
    include StringLabels

    let is_prefix_of (needle : string) (haystack : string) : bool =
      let len = length needle in
      len <= length haystack && (sub haystack ~pos:0 ~len = needle)
  end

  (* Haven't quite found out how to include a module hiding another one... *)
  module List = ListLabels
  module Array = ArrayLabels
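A quick sanity check of the is_prefix_of from the P.S. above, written standalone against StringLabels (assuming the HTML-mangled comparison was the usual `len <= length haystack && ...`):

```ocaml
(* Standalone copy of is_prefix_of for a quick check. *)
open StringLabels

let is_prefix_of (needle : string) (haystack : string) : bool =
  let len = length needle in
  len <= length haystack && sub haystack ~pos:0 ~len = needle

let () =
  assert (is_prefix_of "foo" "foobar");
  assert (not (is_prefix_of "bar" "foobar"));
  (* the empty string is a prefix of everything *)
  assert (is_prefix_of "" "anything")
```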



Re: [Caml-list] SMP multithreading

2010-11-16 Thread Eray Ozkural
Oh well, I'm not so surprised that the fine-grain task-parallelism with (?)
dynamic load-balancing strategy doesn't get much speedup. Doing HPC with
Haskell is a bit like using Java for writing parallel programs, you might as
well use a C-64 and Commodore BASIC. And yes, some people do use Java with
MPI. Java people have benchmarks too :) But for some reason I had difficulty
using Java and Haskell even with medium size problems.

On the other hand, a lot more can be achieved with a parallelizing compiler
that uses profiling and static analysis. As I said, good results can be
achieved even in C; I've seen that, so I know it's doable with ocaml, though
it takes a difficult kind of compiler. The functional features would expose
more concurrency.

At any rate, implicit parallelism isn't the same as a parallelizing
compiler; it's better, because you would be using primitives that the
compiler knows by heart. That's like combining the best of both worlds,
I think, because obviously parallelizing compilers work best on the
easier kinds of parallelism. They can pull more tricks than many assume, but
they would still not replace a parallel algorithm designer.  You don't really
expect to give quicksort as input and get hypercube quicksort as output in
these parallelizing compilers that apply a number of heuristic
transformations to the code, but in many problems they can be made to
generate pretty good code. The best part is once you have it you can apply
it to every program, it's one of the cheapest ways to get speedup, so I'd
say it's worthwhile for ocaml right now. Just not the way GHC does.

On Wed, Nov 17, 2010 at 5:47 AM, Jon Harrop 
jonathandeanhar...@googlemail.com wrote:

 Granularity and cache complexity are the reasons why not. If you find
 anything and everything that can be done in parallel and parallelize it then
 you generally obtain only slowdowns. An essential trick is to exploit
 locality via mutation but, of course, purely functional programming sucks at
 that by design not least because it is striving to abstract that concept
 away.



 I share your dream but I doubt it will ever be realized.



 Cheers,

 Jon.



 *From:* caml-list-boun...@yquem.inria.fr [mailto:
 caml-list-boun...@yquem.inria.fr] *On Behalf Of *Eray Ozkural
 *Sent:* 16 November 2010 23:05
 *To:* Gerd Stolpmann
 *Cc:* caml-list@yquem.inria.fr
 *Subject:* Re: [Caml-list] SMP multithreading






Re: [Caml-list] SMP multithreading

2010-11-16 Thread Gabriel Kerneis
On Wed, Nov 17, 2010 at 06:27:14AM +0200, Eray Ozkural wrote:
 As I said even in C good results can be achieved, I've seen that, so I
 know it's doable with ocaml, just a difficult kind of compiler. The
 functional features would expose more concurrency.

Could you share a pointer to a paper describing this compiler?

Thanks,
-- 
Gabriel Kerneis
