[Caml-list] Re: thousands of CPU cores
On 10-07-2008, Oliver Bandel [EMAIL PROTECTED] wrote: Using multiple processes instead of multiple threads is the usual way on Unix, and it has a lot of advantages. Thread apologists often say that threads are the ultimate technology... but processes have the advantage of encapsulating the whole environment of the program. There is also the fact that using multiple processes allows you to go beyond the memory limit per process (3GB for Linux / 1GB for Windows). With the current increase in the amount of RAM, this can be an issue. For some time now, most vendors have been selling computers with at least 1GB and often 2GB or more. Regards, Sylvain Le Gall ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] Re: thousands of CPU cores
On 11-07-2008, Jon Harrop [EMAIL PROTECTED] wrote: On Friday 11 July 2008 07:26:44 Sylvain Le Gall wrote: On 10-07-2008, Oliver Bandel [EMAIL PROTECTED] wrote: However, any serious power users will already be on 64-bit, where these limits have been relegated to quaint stories your grandpa will tell you. Just as you cannot ignore people running on Windows, you cannot ignore people running on older hardware. If you plan to program a big DB that will use more than 3GB on 32-bit hardware, you should take care of this memory limit and consider splitting your application into several processes... The process approach to parallelism:
- is basic, but should fit most OSes around
- requires work to split the application correctly, with respect to the required communication bandwidth
- cannot take advantage of shared memory (well, you CAN share memory, but it is not as easy as in a threaded/single-process design)
- increases safety by really separating data
I mean, you can get really good performance with a threaded app, BUT you have many drawbacks that create weird behavior/undetectable runtime bugs. The process approach is portable and limits possible bugs to the communication layer... Regards, Sylvain Le Gall
Re: [Caml-list] Heaps size problems with caml_alloc_small in foreign function interfaces
On Fri, Jul 11, 2008 at 10:40:50AM +0100, Richard Jones wrote: v = caml_alloc (2, 0); vv = caml_alloc (3, 0); /* GC could run here */ Ick, actually caml_alloc is OK; it's only caml_alloc_small that doesn't initialize the fields. Rich. -- Richard Jones Red Hat
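To make the distinction concrete, here is a hedged sketch of a hand-written C stub (the function and variable names are illustrative, not from the thread). The point is that caml_alloc returns a block whose fields are already valid values, while caml_alloc_small leaves them uninitialized, so they must be filled before anything can trigger the GC:

```c
/* Illustrative sketch; assumes the usual OCaml C interface headers. */
#include <caml/mlvalues.h>
#include <caml/alloc.h>
#include <caml/memory.h>

/* Build the OCaml pair (x, y) from two already-valid values. */
CAMLprim value make_pair(value x, value y)
{
  CAMLparam2(x, y);
  CAMLlocal1(p);
  p = caml_alloc_small(2, 0);
  /* caml_alloc_small does NOT initialize the fields: fill them
     immediately, before any further allocation can run the GC.
     Direct Field() assignment (not caml_modify) is correct here
     because p is freshly allocated in the minor heap. */
  Field(p, 0) = x;
  Field(p, 1) = y;
  /* caml_alloc, by contrast, returns a block whose fields are
     already valid, so a GC between two caml_alloc calls is safe. */
  CAMLreturn(p);
}
```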
Re: [Caml-list] thousands of CPU cores
On Thursday, 10.07.2008, at 23:01 -0400, Brian Hurt wrote: On Thu, 10 Jul 2008, Gerd Stolpmann wrote: I wouldn't take this article too seriously. It's just speculation. I would take the article seriously. Just open up your mind to this perspective: it's a big risk for the CPU vendors to have taken the direction of multi-core. *Precisely*. It also stands in stark contrast to the last 50 or so years of CPU development, which focused on making single-threaded code faster. And, I note, it's not just one CPU manufacturer who has done this (which could be chalked up to stupid management or stupid engineers), but *every* CPU manufacturer. And what do they get out of it, other than ticked-off software developers grumbling about having to switch to multithreaded code? I can only see one explanation: they had no choice. They couldn't make single-threaded code any faster by throwing more transistors at the problem. We've maxed out speculative execution and instruction-level parallelism, pushed cache out well past the point of diminishing returns, and added all the execution units that'll ever be used; what more can be done? The only way left to increase speed is multicore. And you still have the steady drum beat of Moore's law, which, by the way, only guarantees that the number of transistors per square millimeter doubles every so often (I think the current number is 2 years). So we have the new process which gives us twice the number of transistors as the old process, but nothing we can use those new transistors on to make single-threaded code go any faster. So you might as well give them two cores where they used to have one. At this point, there are only two things that can prevent kilo-core chips: one, some bright boy could come up with some way to use those extra transistors to make single-threaded code go faster; or two, Moore's law could expire within the next 16 years. 
We're at quad-core already; another 8 doublings every 2 years, with all doublings spent on more cores, gets us to 1024 cores. Well, it is an open question whether this alternative holds. I mean, there is a market, and if the market says no, we don't need those multicore monsters, the chip companies cannot simply ignore it. Of course, there are applications that would benefit enormously from them, but it is a question whether this is only a niche, or something where you can make enough revenue to pay for the development of such large multicores. In the past, it was very important for hardware vendors that existing software runs quicker on new CPU generations. This is no longer true for multicore. So unless there is a software revolution that makes it simple to exploit multicore, we won't see 1024 cores for the masses. I think it'll be worse than this, actually, once it gets going. The Pentium III (single core) was 9.5 million transistors, while the Core Duo was 291 million. Even accounting for the 2 cores and some cost to put them together, you're looking at the P-III being 1/16th the size of a Core. If put on the same process, so the P-III runs at more or less the same clock speed, how much slower would the P-III be? 1/10th? 1/2? 90% the speed of the Core? So long as it's above 1/16th the speed, you're ahead. If your code can scale that far, it's worthwhile to have 32 P-III cores instead of the dual-core Core Duo: it's faster. Yes, there are limits to this (memory bandwidth springs to mind), but the point is that more, simpler cores could be a performance win, increasing the rate at which cores multiply faster than Moore's law would dictate. If we decide to go to P-III cores instead of Core cores, we could have 1024-core chips coming out in 8 years (4 doublings, not 8, to go from 4x16=64 cores to 1024 cores). 
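The back-of-the-envelope arithmetic above can be checked in a few lines of OCaml (the figures are the post's own assumptions, not measurements):

```ocaml
(* Cores after n doublings, starting from a given core count.
   The starting figures below are the post's assumptions, not data. *)
let cores_after ~start ~doublings = start * (1 lsl doublings)

let () =
  (* 4 cores today, 8 doublings at ~2 years each -> 1024 cores *)
  assert (cores_after ~start:4 ~doublings:8 = 1024);
  (* switch to simpler P-III-class cores: 64 cores now, 4 doublings *)
  assert (cores_after ~start:64 ~doublings:4 = 1024);
  print_endline "arithmetic checks out"
```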
And remember, if Moore's law holds out for another 8 years after we hit 1K cores, that's another 4 doublings, and we're looking at CPUs with 16K cores: literally, tens of thousands of cores. Well, there is another option for the chip industry. Instead of keeping the die at some size and packing more and more cores onto it, they can also sell smaller chips for less. Basically, this alternate path already exists (e.g. Intel's Atom). Of course, this makes the industry more boring, and they would turn into more normal industrial component suppliers. At some point this will be unavoidable. The question is whether it happens in the next few years. Gerd If Moore's law doesn't hold up, that's going to be a different, and much larger and smellier, pile of fecal matter to hit the rotary air impeller. Brian
Re: [Caml-list] Problem of compilation with ocamlopt
I wrote some code which compiles without any problem with ocamlc. When I try to compile with ocamlopt on a first computer, where the version of ocamlopt is 3.09.1, I get the following message: Please submit a proper bug report on the bug tracking system, including code that reproduces the issue. Make sure to mention the OS and the architecture (x86 or x86-64). I'll look into it. - Xavier Leroy
Re: [Caml-list] thousands of CPU cores
On Thursday 10 July 2008 10:00:02 am Jon Harrop wrote: Today's biggest shared-memory supercomputers already have thousands of cores. Also, this is a CNET article... not exactly known for being in-depth or well-researched, and this article is no exception. It is an article based entirely on a few speculative comments from some Intel guys. I wouldn't take it too seriously. Personally, I can see why the Caml development team opted not to put effort into dealing with shared-memory systems. The OCaml development team put huge effort into their concurrent run-time. No, don't get me wrong, I'm all for concurrency, and I'm glad the OCaml dev team put a lot of effort into it. I'm talking about specific optimizations for shared-memory architectures. It is a stop-gap solution... That is not true. Many-core machines will always be decomposed into shared-memory clusters of as many cores as possible, because shared-memory parallelism will always be orders of magnitude more efficient than distributed parallelism. Hmm... that's a good point. Although I want to point out that parallel algorithm design (and hardware design) isn't nearly as well studied. Peng
Re: [Caml-list] thousands of CPU cores
On Fri, 2008-07-11 at 16:06 +0200, Xavier Leroy wrote: [...] The interesting question that this community should focus on (rather than throwing fits about concurrent GC and the like) is coming up with good programming models for parallelism. I'm quite fond of message passing myself, but agree that more constrained data-parallel models have value as well. As Gerd Stolpmann mentioned, various forms of message passing can be exploited from OCaml today, but there is certainly room for improvement there. Perhaps this is subsumed by some of the terminology flying around this discussion, but what about (synchronous) dataflow? I had some pretty good-looking preliminary results implementing telecom algorithms as dataflow networks. One nice side effect was that latency and throughput could be tied to the aspect ratio (length vs. breadth) of the dataflow network. This could be an opening for the design-space trade-off style that hardware designers are used to but that is rare in software. The resulting designs look upside-down to software designers: instead of a few big processes doing complicated work and communicating/coordinating with each other, there is a large number of small functions, each doing its thing to the next item on its input queue and passing it on. -- Bill Wood
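A minimal sketch of the small-stages style Bill describes, using OCaml's threads library and Event channels (the stage functions are made-up examples; compile with the threads library linked in):

```ocaml
(* Each stage is a tiny worker: read an item from its input channel,
   transform it, pass it downstream.  Deeper pipelines trade latency
   for throughput, which is the aspect-ratio knob mentioned above. *)
let stage f cin cout =
  Thread.create
    (fun () ->
       while true do
         let x = Event.sync (Event.receive cin) in
         Event.sync (Event.send cout (f x))
       done)
    ()

let () =
  let c1 = Event.new_channel () in
  let c2 = Event.new_channel () in
  let c3 = Event.new_channel () in
  ignore (stage (fun x -> x * 2) c1 c2);   (* first small worker  *)
  ignore (stage (fun x -> x + 1) c2 c3);   (* second small worker *)
  Event.sync (Event.send c1 10);
  Printf.printf "result = %d\n" (Event.sync (Event.receive c3))
```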
Re: [Caml-list] thousands of CPU cores
On Friday 11 July 2008 15:03:48 Basile STARYNKEVITCH wrote: As a case in point, I suggest an experiment (which unfortunately I don't have the time or motivation to carry out). Replace the current OCaml GC, either in bytecode or in native-code OCaml, with Boehm's collector (which is multithread compatible). Now that I come to think of it, doesn't OCaml extensively break Boehm's assumptions, e.g. that pointer-like values refer to the start of an allocated block? So Boehm is likely not to collect anything. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/products/?e
Re: [Caml-list] thousands of CPU cores
[...] There are good reasons to think that the illusion of shared memory cannot be maintained in the presence of hundreds of computing elements, even using cc-NUMA techniques (i.e. hardware emulation of shared memory on top of high-speed point-to-point links). I'm not arguing any of your points, but just note that larger NUMA machines than that are available and sometimes practical: SGI Altix goes up to 1024 cores with a single system image. (To answer Richard Jones's question, I know BEA have tested their JVM on such a machine, but I have no idea whether it turned out to be useful. I doubt there are many Java applications actually needing such a wide JVM.)
[Caml-list] memory usage
Dear list members, I am trying to run a stochastic simulator (written in OCaml) on a huge data set, and I get the following error message:

sim(9595) malloc: *** mmap(size=1048576) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Fatal error: out of memory.

My system: Mac Pro running OS X 10.5.4; Processor: 2 x 2.8 GHz Quad-Core Intel Xeon; Memory: 10 GB 800 MHz DDR2 FB-DIMM. Does anyone know what happened? Do you have any idea of a parameter I could tune in order to avoid that? Thank you very much! Jean
[Caml-list] Troublesome nodes
Hi, This problem was originally raised in a thread on the ocaml_beginners list [1], but since polymorphic variants, covariant constraints, and recursive knots were brought into the discussion, I reckon it deserves the attention of some heavyweights. Moreover, the problem is trickier than first appearances suggest. So, what's the situation? I want to create a data structure holding document nodes. There are four different kinds of nodes, two of which are terminals (Text and See), and two of which are defined recursively (Bold and Mref). Moreover, both See and Mref produce links, and there is an additional constraint that a link node may *not* be the immediate ancestor of another link node. Using conventional union types, a node could be modelled like this:

module Old_node =
struct
  type seq_t = super_node_t list
  and super_node_t =
    | Nonlink_node of nonlink_node_t
    | Link_node of link_node_t
  and nonlink_node_t =
    | Text of string
    | Bold of seq_t
  and link_node_t =
    | Mref of string * nonlink_node_t list
    | See of string
end

The problem with this representation is that it introduces unwanted scaffolding for nodes. Moreover, it prevents the use of constructor functions for nodes, since non-link nodes may be represented in the tree in a context-dependent fashion: either directly, such as Bold [...], or as Nonlink_node (Bold [...]). Note that preserving the link/nonlink distinction in the structure is helpful for pattern-matching purposes, but the extra scaffolding is just a pain. One alternative is to use polymorphic variants, and to take advantage of the fact that new types can be built as the union of existing ones. Ideally, one could do something like this:

type seq_t = super_node_t list
and nonlink_node_t = [ `Text of string | `Bold of seq_t ]
and link_node_t = [ `Mref of string * nonlink_node_t list | `See of string ]
and super_node_t = [ nonlink_node_t | link_node_t ]

However, this fails with the error "The type constructor nonlink_node_t is not yet completely defined".
Jon Harrop suggested untying the recursive knot, but that solution has a few drawbacks of its own [2]. Another alternative is to flatten the structure altogether and to annotate the constructor functions with phantom types to prevent violation of the no-parent constraint:

module Node: sig
  type seq_t = node_t list
  and node_t = private
    | Text of string
    | Bold of seq_t
    | Mref of string * seq_t
    | See of string
  type +'a t
  val text: string -> [ `Nonlink ] t
  val bold: 'a t list -> [ `Nonlink ] t
  val mref: string -> [ `Nonlink ] t list -> [ `Link ] t
  val see: string -> [ `Link ] t
end =
struct
  type seq_t = node_t list
  and node_t =
    | Text of string
    | Bold of seq_t
    | Mref of string * seq_t
    | See of string
  type +'a t = node_t
  let text txt = Text txt
  let bold inl = Bold inl
  let mref ref inl = Mref (ref, inl)
  let see ref = See ref
end

This works fine, but because the link/nonlink distinction is lost, making even a simple Node_to_Node translator becomes a mess:

module Node_to_Node =
struct
  let rec convert_nonlink_node = function
    | Node.Text txt -> Node.text txt
    | Node.Bold inl -> Node.bold (List.map convert_super_node inl)
    | _ -> failwith "oops"
  and convert_link_node = function
    | Node.Mref (ref, inl) -> Node.mref ref (List.map convert_nonlink_node inl)
    | Node.See ref -> Node.see ref
    | _ -> failwith "oops"
  and convert_super_node node = match node with
    | Node.Text _ | Node.Bold _ ->
        (convert_nonlink_node node : [ `Link | `Nonlink ] Node.t)
    | Node.See _ | Node.Mref _ ->
        convert_link_node node
end

So, I am looking for a solution that meets the following conditions:
- it satisfies the "no link node shall be parent of another" constraint;
- the structure should be pattern-matchable;
- but nodes should be created via constructor functions.
Any ideas? Thanks in advance, and sorry for the long post! Cheers, Dario Teixeira [1] http://tech.groups.yahoo.com/group/ocaml_beginners/message/9930 [2] http://tech.groups.yahoo.com/group/ocaml_beginners/message/9932
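For readers following along, here is a small usage sketch of the phantom-typed interface above (the document contents are made up); the point is that the no-nested-links rule is enforced by the type checker rather than at run time:

```ocaml
(* Assuming the Node module from the post is in scope. *)

(* A link wrapping non-link content: accepted. *)
let ok : [ `Link ] Node.t =
  Node.mref "chapter1" [ Node.bold [ Node.text "see chapter 1" ] ]

(* A link directly inside another link is rejected at compile time,
   because Node.see produces a [ `Link ] Node.t while Node.mref
   expects [ `Nonlink ] Node.t elements:

   let bad = Node.mref "chapter1" [ Node.see "chapter2" ]
*)
```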
Re: [Caml-list] memory usage
It is hard to tell without more information, but sometimes the garbage collector needs some gentle prodding: OCaml handles its own memory, but it can be a bad citizen when it comes to making room for others. Unfortunately, OCaml also has a bit of a double personality: it doesn't know much about resources used in external libraries, or even in some of its own libraries (e.g. on a 32-bit machine, running out of addressable space because of Bigarray.map_file is not unheard of). If this is your problem, you can either sprinkle your source code with calls to Gc.major or tweak the collector using Gc.set. Till On Fri, Jul 11, 2008 at 8:49 PM, Jean Krivine [EMAIL PROTECTED] wrote: Dear list members, I am trying to run a stochastic simulator (written in OCaml) on a huge data set and I have the following error message: [original message quoted above] -- http://till-varoquaux.blogspot.com/
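A hedged sketch of the two knobs Till mentions, using the standard Gc module (the space_overhead value is illustrative; the default is 80, and lower values make the collector work harder to keep the heap small):

```ocaml
(* Prod the collector explicitly, or retune it via Gc.set. *)
let () =
  (* run a full major collection and compact the heap right now *)
  Gc.compact ();
  (* from here on, reclaim memory more eagerly: lower
     space_overhead means more GC work but a smaller heap *)
  Gc.set { (Gc.get ()) with Gc.space_overhead = 40 };
  (* print current heap statistics to stderr for inspection *)
  Gc.print_stat stderr
```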
[Caml-list] newbie ocaml and delimited continuations
hello all, I am a newbie currently learning OCaml and am particularly interested in the new continuations package. There is an announcement here, which includes a tarball source patch (Persistent twice-delimited continuations and CGI programming): http://groups.google.co.uk/group/fa.caml/browse_thread/thread/cac8628ae8ef191d The value of continuations and coroutines is that they make it trivial to map asynchronous operations onto synchronous programming models; they have general application in network stacks, generators, and simulation models (i.e. the benefit of threads without the locking and context-switching cost). My questions are as follows. Is it likely that this package may be incorporated into a later official release of OCaml (i.e. receive first-class support)? Continuations would appear to offer considerable programming expressiveness, even if not widely supported in other languages (cf. Scheme). And a very basic technical question: it is obviously necessary that the package runs as interpreted bytecode, so that the activation record of the function/closure may be saved/serialized off. However, in general, is it possible to mix and match OCaml code compiled with the native (opt) compiler with compiled bytecode? Additionally, would there be any caveats in trying to do this with this continuations package? Note that I have not yet got to functors/modules in my self-learning, where I suspect the general aspect of this question would be explained, but I am trying to anticipate a bit in advance.
[Caml-list] Re: Troublesome nodes
Hi, Dario Teixeira wrote: Ideally, one could do something like this: type seq_t = super_node_t list and nonlink_node_t = [ `Text of string | `Bold of seq_t ] and link_node_t = [ `Mref of string * nonlink_node_t list | `See of string ] and super_node_t = [ nonlink_node_t | link_node_t ] However, this fails with the error "The type constructor nonlink_node_t is not yet completely defined". Jon Harrop suggested untying the recursive knot, but the solution has a few drawbacks of its own [2]. How about just defining:

type seq_t = super_node_t list
and nonlink_node_t = [ `Text of string | `Bold of seq_t ]
and link_node_t = [ `Mref of string * nonlink_node_t list | `See of string ]
and super_node_t = [ `Text of string | `Bold of seq_t | `Mref of string * nonlink_node_t list | `See of string ]

or similar... not sure whether this satisfies your requirements, though. -- Zheng
Re: [Caml-list] thousands of CPU cores
Quoting Peng Zang [EMAIL PROTECTED]: On Thursday 10 July 2008 11:01:31 pm Brian Hurt wrote: On Thu, 10 Jul 2008, Gerd Stolpmann wrote: I wouldn't take this article too seriously. It's just speculation. I would take the article seriously. Just open up your mind to this perspective: it's a big risk for the CPU vendors to have taken the direction of multi-core. *Precisely*. It also stands in stark contrast to the last 50 or so years of CPU development, which focused on making single-threaded code faster. And, I note, it's not just one CPU manufacturer who has done this (which could be chalked up to stupid management or stupid engineers), but *every* CPU manufacturer. And what do they get out of it, other than ticked-off software developers grumbling about having to switch to multithreaded code? I think we can all agree that more computing units being used in parallel is the future. The main point here is that a shared-memory architecture is not necessarily (and in my opinion doubtfully) the approach that will be taken for large numbers of CPUs. [...] For example, if you have a non-profit research project, you can use the BOINC infrastructure, which provides about 58 PCs to help you :) http://en.wikipedia.org/wiki/Berkeley_Open_Infrastructure_for_Network_Computing There is no shared memory as we know it from our local PCs; there is distributed calculation around the whole planet. Threads will not help there ;-) Ciao, Oliver
Re: [Caml-list] Heaps size problems with caml_alloc_small in foreign function interfaces
Could you give me a pointer to information on root registration? What frustrates me is that this is CamlIDL-generated code. Shouldn't it just work? Sean On 12/07/2008, at 0:11, Xavier Leroy [EMAIL PROTECTED] wrote: I'm having a problem where sometimes a call to caml_alloc_small from C results in a segmentation fault. If I increase the size of the stack using OCAMLRUNPARAM=s=1000k then I don't get the crash anymore. It seems strange that I have to increase the size of the heap manually like this. It's probably a root registration problem. These are very sensitive to the times when GC is triggered, which themselves are sensitive to the heap sizes and memory behavior of your program. If I want to increase the size of the heap in C, how do I do this? Could I write a safe caml_alloc_small which first checks to see if there is enough memory and then increases the heap size if not? Don't try to hack around the real problem, but do make a repro case available, no matter how large, on a Web site or as an attachment to a problem report on the bug tracking system, so that others can have a look at it. - Xavier Leroy
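As background on what "root registration" means here, a hedged sketch of the usual pattern in hand-written stubs (the function name is illustrative, not from the thread): every local `value` that must survive a potential GC has to be registered as a root via the CAMLparam/CAMLlocal macros from the OCaml C interface.

```c
#include <caml/mlvalues.h>
#include <caml/alloc.h>
#include <caml/memory.h>

/* Illustrative stub: builds a two-element list [x; x].  Without
   CAMLparam/CAMLlocal, the GC triggered by the second allocation
   could move or collect the block held in `cell`, leaving a
   dangling value and, eventually, a segfault like the one above. */
CAMLprim value cons_twice(value x)
{
  CAMLparam1(x);           /* register the argument as a root */
  CAMLlocal2(cell, outer); /* register the locals as roots    */
  cell = caml_alloc(2, 0);
  Store_field(cell, 0, x);
  Store_field(cell, 1, Val_emptylist);
  outer = caml_alloc(2, 0);  /* GC may run here; cell is protected */
  Store_field(outer, 0, x);
  Store_field(outer, 1, cell);
  CAMLreturn(outer);
}
```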