Hello,

Here is the latest Caml Weekly News, for the week of May 11 to 18, 2010.
1) OCaml-Java project: 1.4 release
2) Camomile 0.7.3
3) OCaml S3 Library?
4) Phantom types
5) Binding and evaluation order in module/type/value languages
6) Other Caml News

========================================================================
1) OCaml-Java project: 1.4 release
------------------------------------------------------------------------
** Continuing an old thread, Warren Harris asked and Xavier Clerc replied:

> I'm curious whether there are any notes / pointers regarding the
> completeness of the ocaml-java implementation (couldn't find this on the web
> site). I'm wondering about the feasibility of using it for a moderately large
> ocaml project I've been working on which uses Lwt to perform async I/O. I
> assume that for this to work with ocaml-java, the lowest levels of Lwt would
> need to be adapted to use NIO or threads in order to run on a JVM. Also my
> application is written in a pure functional style, and relies heavily on
> closures, currying, recursion and the ability for the compiler to do tail call
> optimization. I'm concerned that this will not translate well to Java.

Here is some information about the OCaml-Java project (both current 1.4
release and future 2.0 release).
Regarding compatibility with the original implementation of OCaml:
  - ocamljava is able to compile working versions of ocamlc, ocamllex, menhir
    and itself, giving some confidence in its language-compatibility level;
  - all the libraries bundled with the original implementation have been
    translated and tested through some unit testing (but thread libraries
    tests are only lightweight, labltk support is based on the swank
    project [1], and unix is only partially supported);
  - document [2] gives per-primitive compatibility information;
  - the last page of [3] lists the differences between ocamlc/ocamlopt and
    ocamljava compilers;
  - third-party libraries (such as Lwt) should be supported out-of-the-box
    as long as they rely neither on exotic features (remember: "Obj.magic is
    not part of the OCaml language") nor on C code (unless you can translate
    it to Java code).

Regarding performance in the current version:
  - closures are based upon Java reflection (will be based upon JDK 7's
    method handles, avoiding access checks on each call);
  - only "explicit self" tail calls are optimized ("self" meaning same
    function, and "explicit" meaning "not through a closure");
  - performance is terrible and memory consumption is excessive.

Bottom line: the current version should be regarded as a proof-of-concept.

However, I spent a recent vacation working on the upcoming 2.0 version, with
some positive results. The runtime has been partially rewritten, reducing
memory consumption and increasing performance. The Java equivalent of
"ocamlrun" is now within an order of magnitude of the original tool on all
tested programs, and only about 4 times slower on some (e.g. "nbody" from the
language shootout [4]). This seems to indicate that -when updated- the
ocamljava compiler should be able to at least match ocamlc performance.
Notice that this is a conservative statement / objective if you consider that
Scala is on par with ocamlopt on some benchmarks (but the Scala language has
been designed from the bottom up to be run on the JVM).

A final word about tail call optimization. As previously said, only "explicit
self" calls are optimized (as far as I know, this is also true for mature
compilers like Scala or Clojure). This will not change in future versions,
unless Sun / Oracle adds support for tail calls in the JVM [5] [6]. I know
that some people will regard a compiler for a functional language with no
general tail call optimization as "broken", but it is clear that OCaml-Java
will not perform program transformations (such as trampolining) to enable
tail call optimization in every case. My understanding may be incorrect, but
it seems that such transformations result in obfuscated code (hindering
HotSpot JIT compilation) and unpredictable performance.

Some points above may need to be elaborated, but this very message is already
quite long... Anyway, feel free to ask for further information.

Regards,

Xavier Clerc

[1] <http://www.onemoonscientific.com/swank/summary.html>
[2] <http://cadmium.x9c.fr/distrib/cadmium-compatibility.pdf>
[3] <http://cafesterol.x9c.fr/distrib/cafesterol.pdf>
[4] <http://shootout.alioth.debian.org/>
[5] <http://openjdk.java.net/projects/mlvm/subprojects.html#TailCalls>
[6] <http://wikis.sun.com/display/mlvm/TailCalls>

========================================================================
2) Camomile 0.7.3
Archive: <http://groups.google.com/group/fa.caml/browse_frm/thread/91ded3b129ac8323#>
------------------------------------------------------------------------
** Yoriyuki Yamagata announced:

I'm pleased to announce Camomile 0.7.3, a new version of Camomile, a
comprehensive Unicode library for OCaml.

This is a bug fix release. It fixes the following bugs and Camomile now works
on Windows.
- Aliases of character encodings containing ":" are removed, to support the
  Windows platform.
- Buffering bugs in CharEncoding and OOChannel modules.
- Tree-merging bugs of ISet and IMap.
- Locale data are properly loaded by binary channels. (Windows related)
- "make depend" properly generates .depend file.
- cpp is no longer required to build from the distribution.

The license documentation for locales/*.txt files is added.
(locales/license.html)

The package is tested on Windows (MinGW-port of OCaml 3.11.0 + Cygwin on
Vista SP1) and Linux (Ubuntu 8.04 + godi version of OCaml 3.11.2).

You can download the package from
<https://sourceforge.net/projects/camomile/>

For more information on Camomile, see the project web page
<http://camomile.sourceforge.net/>.

I would appreciate your comments and/or opinions. In particular, I'd like to
know whether you can successfully operate Camomile on your platform. I have
received complaints about Mac and MinGW environments, and although I believe
that the problems are fixed, I'm too lazy to find a spare Mac around and test
the package :-)

Also, I would like to hear about a success (/failure) story of Camomile. Do
you use Camomile? What for? This is important since I have to convince my
boss to allow me to invest some spare time in the Camomile project.

** Romain Beauxis replied:

Thanks for your work on camomile!

We have been using and enjoying it with liquidsoap [1] for probably almost 3
years now.

We are using camomile to implicitly convert the various metadata tags into
unicode. I am personally impressed by how well the module works. As far as I
can tell we have had no issues with character encoding since we started
using camomile!

Romain

[1]: <http://savonet.sf.net/>

** Sylvain Le Gall said and Yoriyuki Yamagata replied:

> I use camomile as the default implementation for ocaml-gettext:
> <http://forge.ocamlcore.org/projects/ocaml-gettext>
>
> Basically I do encoding conversion with it.
> Projects of mine built with
> ocaml-gettext use camomile as well.
>
> Regards
> Sylvain
>
> ps: as asked on the BTS, a way to be able to relocate the .mar file
> would be something great -- this actually prevents me to distribute my
> projects built with camomile. (relocate -> remove the hardcoded path)

Hmm... you should be able to relocate the .mar files by providing their
location through the functor CamomileLibrary.Main.Make. If you do not link
CamomileLibrary.Default, Camomile won't load the data from the hardcoded
path. If it does, then I regard it as a bug.

** Xavier Clerc said:

Camomile is used in Barista [1] to encode / decode / modify UTF8 strings of
Java classes.

My opinion of Camomile is absolutely positive:
  - installation is a breeze;
  - API is complete yet straightforward;
  - it meets a need of the OCaml ecosystem.

Not yet updated to 0.7.3 but 0.7.2 used to work like a charm on both MacOS X
and Fedora.

Regards,

Xavier Clerc

[1] library (part of OCaml-Java) to load / manipulate / save Java ".class"
    files - <http://barista.x9c.fr/>

** Dmitry Bely asked and Yoriyuki Yamagata replied:

> How "heavy-weight" is Camomile? I was a bit scared with the size of
> its distribution. Currently I use under Windows the following my own
> simple Unicode-support module (implemented via
> WideCharToMultiByte/MultiByteToWideChar Win32 API functions). Maybe
> it's time to switch to Camomile?

The size of the package is due to the mapping tables of character encodings
and the localization data. They occupy several megabytes on disk, but that is
nothing by today's standards. If you still care, you can delete any .mar
files in the charmaps, locales, and mappings directories. (Deleting source
files in these directories is not recommended, since it could cause a
compilation failure.) If you delete such files, the related encodings and
locales do not function, but other functionality is intact.
As for memory consumption, character mapping tables and localization data are
loaded into memory only when they are required. So if you do not use them,
they just sit on the disk. After being loaded into memory, they are retained
by a weak hashtable; therefore, when they are no longer used, the GC releases
them.

Some modules load data from the database directory during initialization.
These data persist until the end of execution, but they are usually small
(mostly some kilobytes in the marshaled form; only allkey.mar, which is used
in the uCol module, is >100k bytes (*)). You can avoid loading them by not
linking these modules.

Finally, the programming style needed to use Camomile would be a bit heavy,
since Camomile uses functors a lot. In fact, the library itself is a functor
which takes modules containing initialization parameters.

Heaviness aside, Camomile does not provide OS integration, which might be a
problem for you. I tried POSIX integration but abandoned it for a reason
which I no longer remember :-) But you can use the UTF16 module to access a
wch array, since a Windows wch is a 16-bit integer. (UTF16.t is a 16-bit
integer array implemented in the bigarray module.)

(*) I noticed allkey.mar can be retained by a weak pointer as well
(currently, it is held in Lazy.t). I will fix it in the next release, maybe.

========================================================================
3) OCaml S3 Library?
Archive: <http://groups.google.com/group/fa.caml/browse_frm/thread/fc851484522fad6c#>
------------------------------------------------------------------------
** Daniel Patterson asked and Jerome Vouillon replied:

> Has anyone written a library to interact with amazon web services
> (specifically, S3)?

I have started writing a library for S3. A large part of the API is
implemented. What is missing is mostly dealing with metadata and error
handling. You can find the current sources here:
<http://www.pps.jussieu.fr/~vouillon/S3.tar.gz>

If anyone is interested to contribute, please contact me.
I will then set up a shared repository.

========================================================================
4) Phantom types
Archive: <http://groups.google.com/group/fa.caml/browse_frm/thread/0df560ee78e0f75f#>
------------------------------------------------------------------------
** Thomas Braibant asked and Goswin von Brederlow replied:

> Following on the thread "Subtyping structurally-equivalent records, or
> something like it?", I made some experimentations with phantom types.
> Unfortunately, it turns out that I do not understand the results I got.
>
> Could someone on the list explain why the first piece of code is well
> typed, while the second one is not ?
>
>
> type p1
> type p2
>
> type 'a t = float
> let x : p1 t = 0.0
> let id : p2 t -> p2 t = fun x -> x
> let _ = id x (* type checks with type p2 t*)

This is actually a problem, at least for me, because that is a type error
you usually want to catch specifically through the use of phantom types. But
ocaml knows that 'a t = float and all floats are compatible, so it accepts
all 'a t as the same.

To make the phantom type checking work for you, you need to move the
definition of your phantom type into a submodule and make the type abstract
through its signature. Any functions converting from one 'a t to 'b t also
need to be in there. To avoid having to use the submodule name all the time
you can use something like

module M : sig
  type 'a t = private float
  val make : float -> 'a t
end = struct
  type 'a t = float
  let make f = f
end
include M

# let x : p1 t = make 0.0;;
val x : p1 t = 0.
# let id : p2 t -> p2 t = fun x -> x;;
val id : p2 t -> p2 t = <fun>
# let _ = id x;;
Error: This expression has type p1 t = p1 M.t
       but an expression was expected of type p2 t = p2 M.t

The "private" tells the type system that nobody (outside the module) is to
create a value of that type. Only inside the module, where the type isn't
private, can you create one.
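The advice about hiding the definition behind a signature can be sketched with a fully abstract type instead of a private one. This is only an illustration under stated assumptions: the module name Length, the tags meters and feet, and the functions of_float, to_float and add are hypothetical names, not taken from the thread.

```ocaml
(* Phantom tags: never inhabited, used only as type-level markers.
   All names in this sketch are hypothetical. *)
type meters
type feet

module Length : sig
  type 'a t                       (* abstract: clients cannot see float *)
  val of_float : float -> 'a t
  val to_float : 'a t -> float
  val add : 'a t -> 'a t -> 'a t  (* both arguments must carry the same tag *)
end = struct
  type 'a t = float
  let of_float f = f
  let to_float f = f
  let add a b = a +. b
end

let m1 : meters Length.t = Length.of_float 1.5
let m2 : meters Length.t = Length.of_float 2.5
let total = Length.add m1 m2

let f1 : feet Length.t = Length.of_float 3.0
let feet_value = Length.to_float f1

(* Rejected at compile time if uncommented, since meters and feet
   do not unify:
   let bad = Length.add m1 f1 *)
```

Because the signature gives no equation for the type, the checker has to compare the tags, which is the behaviour the plain abbreviation in the quoted code lacks.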
> type 'a t = {l: float}
> let x : p1 t = {l = 0.0}
> let id : p2 t -> p2 t = fun x -> x
> let _ = id x (* ill typed *)

Why it works correctly here is explained nicely in the other mails in this
thread.

** Rabih Chaar also replied:

If you define the intermediate type

type s = {l:float}

followed by

type 'a t = s

everything goes well. I am unable to give you an explanation about this (the
need for the intermediary type s). I hope someone can shed some light on
this.

** Philippe Veber suggested:

It seems that the expressions typecheck when t is a type abbreviation and not
a type definition. I don't know about the actual typing rules but this would
be reasonable, I guess.

** Dawid Toton also replied:

I think the crucial question is when new record types are born. Here is my
opinion:

The "=" sign in the above type mapping definition is what I would call
"delayed binding". "Early binding" would be equivalent to

type tmp = {lab : float}
type 'a s = tmp

(evaluate the right-hand side first, then define the mapping).

The "early binding" creates only one record type, so lab becomes an ordinary
record label.

In the given example of the "delayed binding", t becomes a machine producing
new record types. Hence, the identifier l is not an ordinary record label. It
is shared by a whole family of record types. We can see it this way:

# type 'a t = { la : float } ;;
type 'a t = { la : float; }
# {la = 0.};;
- : 'a t = {la = 0.}

So the OCaml interpreter doesn't know the exact type of the last expression,
but it is clever enough to give it a generalized type.

We can use la to construct records of incompatible types:

# type 'a t = { la : float } ;;
type 'a t = { la : float; }
# let yy = ({la = 0.} : int t) ;;
val yy : int t = {la = 0.}
# let xx = ({la = 0.} : string t);;
val xx : string t = {la = 0.}
# xx = yy;;
Error: This expression has type int t but an expression was expected of type
string t

I suppose my jargon may be not mainstream, apologies.
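The difference between an abbreviation and a parameterised record definition, as discussed in these replies, can be sketched side by side. The type names s, alias and fresh and the value names below are hypothetical, chosen only for illustration.

```ocaml
(* One record type, then a parameterised abbreviation for it. *)
type s = { l : float }
type 'a alias = s

(* alias ignores its parameter: int alias and string alias both expand to s. *)
let a : int alias = { l = 0.0 }
let id_alias : string alias -> string alias = fun v -> v
let a2 = id_alias a                 (* accepted *)

(* A parameterised record definition: instances with different
   parameters are distinct types. *)
type 'a fresh = { m : float }

let b : int fresh = { m = 0.0 }
let id_fresh : string fresh -> string fresh = fun v -> v

(* Rejected if uncommented, since int fresh is not string fresh:
   let c = id_fresh b *)

(* An explicit coercion is accepted, because the parameter is unused
   in the record body. *)
let b2 = id_fresh (b : int fresh :> string fresh)
```

The last line is why a bare parameterised record gives weaker guarantees than an abstract or private type: the two instances are not equal, but a client can still coerce between them.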
========================================================================
5) Binding and evaluation order in module/type/value languages
Archive: <http://groups.google.com/group/fa.caml/browse_frm/thread/96dd3e1479c2c6bc#>
------------------------------------------------------------------------
** Dawid Toton asked and Jacques Garrigue replied:

> Thinking about the discussion in the recent thread "Phantom types" [1],
> I have created the following piece of code that aims to demonstrate
> binding and evaluation order that takes effect in all three levels of OCaml.
>
> My question is: what are the precise rules in the case of type language?
> I have impression that it is lazy and memoized evaluation. But this my
> guess looks suspicious.
>
> I don't intend this question to be about inner working of the compiler,
> but about the definition at the conceptual level.

It is not clear that this will work at all. The semantics of ocaml (contrary
to SML) is not defined in terms of type generativity.

> (* 1. Module language; side effect = create fresh record type; test =
> type equality test *)
>
> module type T = sig type t end
> module R (T:T) = struct type r = {lab : int} end
>
> module TF = struct type t = float end
> module TS = struct type t = string end
> module R1 = R(TF)
> module R2 = R(TF)
> module R3 = R(TS)
>
> let test12 (k : R1.r) (l : R2.r) = (k=l) (* pass => R1.r = R2.r *)
> let test13 (k : R1.r) (l : R3.r) = (k=l) (* pass => R1.r = R3.r *)
>
> (* Conclusion: RHS evaluated at the mapping definition point *)

Wrong: test13 fails. OCaml functors are applicative, and type equality is the
equality of paths. Stranger: defining TF twice, or defining an alias for TF,
will result in different types.
module TF' = TF
module R4 = R(TF')
let f (x : R1.r) = (x : R4.r);;
Error: This expression has type R1.r = R(TF).r
       but an expression was expected of type R4.r = R(TF').r

So your comment for the type language applies here:

> (* Conclusion: RHS evaluated some time after the mapping is applied;
> sort of memoization at the conceptual level *)

I'm not sure there is a timing here, but there is certainly some form of
memoization, which happens to be done by name.

> (* 2. Type language; side effect = create fresh record type; test = type
> equality test *)
>
> type 't r = {lab : int}
>
> type tf = float
> type ts = string
> type r1 = tf r
> type r2 = tf r
> type r3 = ts r
>
> let test12 (k : r1) (l : r2) = (k=l) (* pass => r1 = r2 *)
> let test13 (k : r1) (l : r3) = (k=l) (* fail => r1 ≠ r3 *)
>
> (* Conclusion: RHS evaluated some time after the mapping is applied;
> sort of memoization at the conceptual level *)

Lots of people seem to think that generativity of type definitions gives you
phantom types. This is true in Haskell, but not in ocaml.

# let f x = (x : r1 :> r3);;
val f : r1 -> r3 = <fun>

So r1 and r3 are not strictly equal, but they are equivalent from the point
of view of subtyping. Please do not use this approach for phantom types,
except if you want the ability to forget the argument where desired.

As written in other mails, the right approach for phantom types is to use
private types, as you can control the variance of their parameters.

Conclusion: in the type language, this is more like you thought at first for
functors, i.e. RHS evaluated at the mapping definition point. But if the RHS
is abstract/private or includes the type parameters, then they are involved
in deciding the equality (structurally, not through generativity).

> (* 3. Value language; side effect = create fresh int; test = value
> equality test *)
> let r t = Oo.id (object end)
>
> let tf = 0.
> let ts = "A"
> let r1 = r tf
> let r2 = r tf
> let r3 = r ts
>
> let test12 = assert (r1 = r2) (* fail => r1 ≠ r2 *)
> let test13 = assert (r1 = r3) (* fail => r1 ≠ r3 *)
>
> (* Conclusion: RHS evaluated exactly at the point of mapping application *)

Sure, this is an eager language.

> [1]
> <http://groups.google.com/group/fa.caml/browse_thread/thread/0df560ee78e0f75f#>

========================================================================
6) Other Caml News
------------------------------------------------------------------------
** From the ocamlcore planet blog:

Thanks to Alp Mestan, we now include in the Caml Weekly News the links to the
recent posts from the ocamlcore planet blog at <http://planet.ocamlcore.org/>.

Asyncore:
  <https://forge.ocamlcore.org/projects/asyncore/>

Camomile 0.7.3:
  <http://caml.inria.fr/cgi-bin/hump.cgi?contrib=85>

Froc 0.2:
  <http://caml.inria.fr/cgi-bin/hump.cgi?contrib=692>

Observable sharing and executable proofs:
  <http://alaska-kamtchatka.blogspot.com/2010/05/observable-sharing-and-executable.html>

mpfi:
  <https://forge.ocamlcore.org/projects/mpfi/>

========================================================================
Old cwn
------------------------------------------------------------------------

If you happen to miss a CWN, you can send me a message
(alan.schm...@polytechnique.org) and I'll mail it to you, or go take a look at
the archive (<http://alan.petitepomme.net/cwn/>) or the RSS feed of the
archives (<http://alan.petitepomme.net/cwn/cwn.rss>). If you also wish to
receive it every week by mail, you may subscribe online at
<http://lists.idyll.org/listinfo/caml-news-weekly/>.

========================================================================

_______________________________________________
caml-news-weekly mailing list
caml-news-weekly@lists.idyll.org
http://lists.idyll.org/listinfo/caml-news-weekly