Re: [Caml-list] JIT & HLVM, LLVM
2009/9/27 Jon Harrop : > On Sunday 27 September 2009 20:23:13 kche...@math.carleton.ca wrote: >> If Grand Central Dispatch makes its way >> into *nix... > > Incidentally, what exactly (technically) is Grand Central and how does it > relate to existing alternatives and the state-of-the-art? I Googled it but > failed to find anything useful... > Grand Central Dispatch (GCD, where libdispatch is client side library) is a very simple piece of code (with some non-trivial implementation details). Some would say it is just a thread queue, but it is quite clever the same way map-reduce is clever and simple. It may help to read the Ars Technica article first, since I dig a little below the user level api. Two important concepts: GCD does not require threads, and it does not require kernel support. When present, GCD becomes efficient at handling multiple cores - or this is the claim, and at least OS X 10.6 Snow Leopard seems to be much smoother than OS X 10.5 Leopard, so I guess there is something to it. But without threads and kernel support, it is still a good programming model, it would seem (haven't burned my fingers yet...) To better understand GCD, think of a single threaded event driven application with support for nested event queues. We start with a single thread, later extend the concept to multiple threads and cores: There is set of global queues with different priorities, one which is the default. Since we assume single threaded for now, we limit this to only one global queue. Tasks are pushed onto the global queue. A task is just function with data, i.e. a closure. In C this is a function pointer and a void * to context, or it is block if you have the blocks extension, but the blocks extension really is optional, although very useful since it can take an standard C block and with a little syntax turn it into a closure with a copy of the parent scope state which can then quickly be turned into a parallel loop. But this is just syntax, really not too important for the underpinnings of libdispatch. A function can push other functions to the global queue. This in effect creates a continuation, and the concept is similar to continuation passing style or LWT threads, but requires no continuation arguments to drag around. A task should return soon, and avoid blocking operations - especially in single threaded setup, but generally blocking is bad in GCD. The interesting part is how this gets scheduled: There is no central scheduler. When our current task function returns (which it should do soon), there is another piece of calling code that earlier picked our task from a queue. Upon task return this code now picks the next task on the queue and calls it - ad infinitum until we reach an empty queue, in which case our control code can exit - i.e. exit the thread. So far this your standard http web server event loop. Now we add a thread pool to the queue. This means that N threads are all either executing a task from the queue, or actively executing a task on the queue. One a task completes, the next task is popped. Since there are N threads, we have contention on the queue, so it must be briefly locked with compare-exchange logic. Now we have a concurrent queue with the semantic that tasks are started in order, but execute concurrently. Any and all synchronization between tasks must be managed by client code if they access shared resources. Which brings us to the next point: A task (or the main program thread) can create a new serial queue. A serial queue is really just a task pushed onto the default global queue, but this is mostly invisible to the user, but internally this makes the control very simple and elegant (thread local storage is used to identify the current parent queue in the queue creation function). Our active task can now push new tasks onto the serial queue. These tasks are guaranteed to be isolated. The serial queue tasks will eventually be popped from the global concurrent queue either after our function exits, or sooner if there or other vacant threads in the pool. Once the serial queue task is popped, it starts popping tasks of the queue, while still locking pop operations to avoid conflict with new pushes. Note that every single thread cooperate on scheduling by popping the next task from the queue it is currently assigned to. When a task is a queue task, it will temporarily change the thread task to the new queue until that queue is exhausted, then the thread returns focus to the previous queue. A task can choose to wait on the completion of another task it pushes, in which case it is easy to deadlock if not being careful. It is possibly to create an insanely huge amounts of queues in this system. The only real overhead is the costs of memory barriers used when pushing and popping. Apple refers to this as islands of serialization in a sea of concurrency. There is also the event signalling subsystem that grabs descriptor events and holds a task until signalled, then
Re: [Caml-list] xpath or alternatives
In line with what Yaron suggests, you can use a combinator parser. I do this to parse json, and this parser could be adapted to xml by focusing on basic syntax and ignoring the details, or you could prefilter xml and use the json parser directly. See the Fleece parser embedded here: There is also the object abstraction that dives into an object hierarchy after parsing, see the Objects module. The combination of these two makes it quite easy to work on structured data, but 3 lines only come after some xml adaptation work - but you can see many one-liner json access in the last part of the file. http://git.dvide.com/pub/symbiosis/tree/myocamlbuild_config.ml Otherwise there is xmlm which is self-contained in single xml file, and as I recall, has some sort of zipper navigator. (I initially intended to use it before deciding on the json format): http://erratique.ch/software/xmlm ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] ANN: pa_polyrec: syntax for polymorphic recursion
Thanks for the comments, Damien. Damien Guichard wrote: Problems are : 1. functions are harder to type 2. type errors are quite hard to interpret and to locate (although *pa-polyrec* certainly helps in this department) 3. benchmarks show you pay a certain (small) price for the added type expressivity I think that the arguments you make against polymorphic recursion could also be made against other features of OCaml, such as structurally-typed objects and polymorphic variants: they make type errors more complex and can be less efficient than the alternatives. Still, I'm glad that OCaml has those features because they make it possible to write programs that would otherwise be impossible. I'm glad you raised these points, though, because they highlight the fact that there is a tradeoff involved. If you use nested data types and polymorphic recursion then you may have to work harder to satisfy the compiler. However, in return the compiler will guarantee the absence of certain errors that would otherwise remain undetected until runtime, and perhaps forever. May be polymorphic recursion is actually a tool for the working ML programmer. Yet i am still waiting to see the exemple that is more than a nice trick for the ML hacker. Here is one example: the following paper shows how to use OCaml extended with GADTs to give a precise type to the output of a parser generator: Towards efficient, typed LR parsers François Pottier and Yann Régis-Gianas ACM Workshop on ML, 2006 http://gallium.inria.fr/~fpottier/publis/fpottier-regis-gianas-typed-lr.pdf The run and gotoE procedures, which form a central part of the implementation described in that paper, are inherently polymorphic-recursive, just like most other non-trivial functions over GADTs. OCaml doesn't have GADTs, of course, but Oleg recently described a technique that can be used to translate many programs that use GADTs into OCaml: http://caml.inria.fr/pub/ml-archives/caml-list/2009/07/2984f23799f442d0579faacbf4e6e904.en.html Polymorphic recursion is an important element of that technique. In my opinion, when wants the added type expressivity he actually wants dependant types. That may be true, but since OCaml doesn't support dependent types I think that it's useful to know what can be accomplished without them. There is an ancillary benefit to pa_polyrec (besides polymorphic recursion): it provides a clear way of ensuring that a function is polymorphic, giving a solution to the problem described in this blog post by Stephen Weeks: http://ocaml.janestreet.com/?q=node/29 For example, suppose that you have a function let f = fun x -> 13 :: x and you want to tell the compiler that it has type 'a list -> 'a list, for any type 'a. (Since it actually does *not* have that type, we want the compiler to complain.) This is surprisingly difficult to write straightforwardly in OCaml! You can't write the following let f : 'a list -> 'a list = fun x -> 13 :: x since OCaml will simply unify the variable 'a with the element type of the list, and type-checking will succeed. That blog post shows how to use an uninhabited type to guarantee polymorphism: you can write something like the following type a let f = fun x -> 13 :: x let (_ : a list -> a list) = f and the compiler will duly complain that 'a' and 'int' cannot be unified. This is certainly a useful trick; still, it would be nice to be able to ascribe the polymorphic signature directly. Using pa_polyrec one can write let rec f : 'a. 'a list -> 'a list = fun x -> 13 :: x and the compiler will complain that the type of the right hand side is insufficiently general. (Of course, it would be better to be able to omit the 'rec'; perhaps a future release will permit that.) ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Incremental linking
Am Dienstag, den 29.09.2009, 20:39 +0200 schrieb Dawid Toton: > I have lot of modules and they are compiled to native code. > So I have .cmx and .o files and want to link them faster. Well, you can link several .cmx files (and their accompanying .o files) to a .cmxa file (and an accompanying .a file): ocamlopt -a You cannot do the same again on the next level, i.e. link several .cmxa/.a together to get another .cmxa/.a. (I don't remember why this restriction exists.) Gerd > > Is is possible to make linking an associative operation acting on modules? > > I would like to do something like the following: > Knowing the correct partial order of modules (that compiler requires) I > can create a tree that preserves that order. Leafs are modules. Other > nodes of the tree correspond to a result of linking all descendant > modules. Modules that are frequently recompiled are placed closer to the > root. This way I expect to execute less linking operations during > development. > > Documentation of ld says that files produced with --relocatable can be > used as intermediate partially linked files. Can something like this be > done with object code produced by ocamlopt? > I don't know ocaml-specific details of linking, so maybe I overlook some > obvoius obstacle? > > Dawid > > ___ > Caml-list mailing list. Subscription management: > http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list > Archives: http://caml.inria.fr > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs > -- Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany g...@gerd-stolpmann.de http://www.gerd-stolpmann.de Phone: +49-6151-153855 Fax: +49-6151-997714 ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] Incremental linking
I have lot of modules and they are compiled to native code. So I have .cmx and .o files and want to link them faster. Is is possible to make linking an associative operation acting on modules? I would like to do something like the following: Knowing the correct partial order of modules (that compiler requires) I can create a tree that preserves that order. Leafs are modules. Other nodes of the tree correspond to a result of linking all descendant modules. Modules that are frequently recompiled are placed closer to the root. This way I expect to execute less linking operations during development. Documentation of ld says that files produced with --relocatable can be used as intermediate partially linked files. Can something like this be done with object code produced by ocamlopt? I don't know ocaml-specific details of linking, so maybe I overlook some obvoius obstacle? Dawid ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] Incremental linking
I have lot of modules and they are compiled to native code. So I have .cmx and .o files and want to link them faster. Is is possible to make linking an associative operation acting on modules? I would like to do something like the following: Knowing the correct partial order of modules (that compiler requires) I can create a tree that preserves that order. Leafs are modules. Other nodes of the tree correspond to a result of linking all descendant modules. Modules that are frequently recompiled are placed closer to the root. This way I expect to execute less linking operations during development. Documentation of ld says that files produced with --relocatable can be used as intermediate partially linked files. Can something like this be done with object code produced by ocamlopt? I don't know ocaml-specific details of linking, so maybe I overlook some obvoius obstacle? Dawid ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] ANN: pa_polyrec: syntax for polymorphic recursion
Hi Jeremy, Your syntax extension is nice and simple enough. It will really help in demystifying polymorphic recursion. The provided examples are particularly usefull in understanding where polymorphic recursion matters. My opinion is that polymorphic recursion is largeley overestimated and has little real-life use in a ML context. Basically there are two usage patterns : the Okasaki/Paterson style the J.J.Hallett & A.J.Kfoury style The J.J.Hallett & A.J.Kfoury style is just plain misuse of the rec keyword. That is instead of a wrong rec scope : let rec double : 'a . ('a -> 'a) -> 'a -> 'a = fun f y -> f (f y) and foo : 'x . int -> int = fun v -> double (fun x -> x + 1) v and goo : 'y . float -> float = fun w -> double sqrt w in (foo 3, goo 16.0) One shoud use a correct rec scope : let rec double f y = f (f y) in let foo v = double (fun x -> x + 1) v and goo w = double sqrt w in (foo 3, goo 16.0) Especially, OCaml beginner should be warned that polymorphic recursion is not a solution to their typing problems. The Okasaki/Paterson style is more insteresting. Basically it's a poor man's dependant types. Instead of : type 'a perfect = Leaf of 'a | Node of 'a perfect * 'a perfect One does write : type 'a perfect = Zero of 'a | Succ of ('a * 'a) perfect Then the compiler can statically check that the tree is always complete (perfectly balanced). Problems are : functions are harder to type type errors are quite hard to interpret and to locate (although pa-polyrec certainly helps in this department) benchmarks show you pay a certain (small) price for the added type expressivity In my opinion, when wants the added type expressivity he actually wants dependant types. I mean one should prefer the following Coq code, that is dependant types rather than nested data types : Variable A : Set. Inductive perfect : nat -> Set := | Leaf : perfect O | Node : forall n, A -> perfect n -> perfect n -> perfect (S n). The de Bruijn exemple is representative enough : either you seek after efficiency then nested data type will just hinder interpreter performance either you seek after interpreter validation then dependant types is the way to go May be polymorphic recursion is actually a tool for the working ML programmer. Yet i am still waiting to see the exemple that is more than a nice trick for the ML hacker. Hoping i have made the matter less esoteric to OCaml beginners, Regards, - damien Damien Guichard 2009-09-29 En réponse au message de : Jeremy Yallop du : 2009-09-28 22:56:32 À : caml-l...@inria.fr CC : Sujet : [Caml-list] ANN: pa_polyrec: syntax for polymorphic recursion I'm pleased to announce the initial release of pa_polyrec, a syntax extension for polymorphic recursion in OCaml. https://forge.ocamlcore.org/projects/pa-polyrec/ There are several methods for encoding polymorphic-recursive functions in OCaml; this extension allows them to be written directly, using a natural syntax. For example, given the following type of perfectly-balanced trees we might wish to write a function for summing the leaves. type 'a perfect = Zero of 'a | Succ of ('a * 'a) perfect In standard OCaml such a function can be written as follows: let sump f = (object (o) method sump : 'a. ('a - > int) - > 'a perfect - > int = fun f - > function | Zero x - > f x | Succ x - > o#sump (fun (a, b) - > f a + f b) x end)#sump f let sum_perfect = sump id Using pa_polyrec one can write the function in the following less obfuscated style: let rec sump : 'a. ('a - > int) - > 'a perfect - > int = fun f - > function | Zero x - > f x | Succ x - > sump (fun (a, b) - > f a + f b) x let sum_perfect = sump id Note that the type variable 'a in the type of the function is quantified: this is what differentiates polymorphic-recursive functions from standard OCaml recursive function definitions. More complex usage is supported, including mutual recursion. A number of examples are included in the distribution, including several modules from Chris Okasaki's thesis "Purely Functional Data Structures" and code from Richard Bird and Ross Paterson's paper "de Bruijn notation as a nested datatype". ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs