Re: [MirageOS-devel] Cohttp Design -- LWT, Async, JS, Mirage Compatibility

Anil Madhavapeddy Sat, 28 Mar 2015 12:56:10 -0700

Indeed, these are very good approaches too.  There's a distinct tradeoff in all 
three examples though:


- ocaml-tls is a very complex state machine, and so the pure core dominates the 
Lwt driver.  After that though, there's no abstraction across (e.g.) Async or 
Lwt.  I'm not sure how well this would translate to HTTP, which is dominated by 
I/O logic (after all, there's a pure protocol core already).

- Despite being a heavy user of Daniel's libraries, I don't find them to have 
the most usable interfaces for quick usage, although they are the best 
documented and most carefully thought through of the libraries I choose from 
(hence the existence of ezxmlm, ezjsonm, and so on as wrapper libraries).  A 
monadic interface is quite intuitive to pick up and use.
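
As a rough illustration of the "pure core plus driver" shape discussed above, 
here is a toy sketch.  It assumes a newline-delimited protocol rather than TLS 
or HTTP, and `state`, `action`, `step` and `run_on_chunks` are names invented 
for this example; none of them come from ocaml-tls or uutf.

```ocaml
(* Hypothetical toy protocol core: it never performs I/O itself, it only
   returns values that tell a driver what to do next. *)
type state = { pending : string } (* bytes seen since the last newline *)

type action =
  | Need_input     (* the driver should feed more bytes *)
  | Emit of string (* the driver should deliver one complete line *)

let init = { pending = "" }

(* Feed one chunk; return the new state and the actions, in order. *)
let step (st : state) (chunk : string) : state * action list =
  let rec go acc = function
    | [] -> ({ pending = "" }, List.rev (Need_input :: acc))
    | [ rest ] -> ({ pending = rest }, List.rev (Need_input :: acc))
    | line :: more -> go (Emit line :: acc) more
  in
  go [] (String.split_on_char '\n' (st.pending ^ chunk))

(* One possible driver: feeds chunks from a list and collects emitted lines.
   An Lwt or Async driver would interpret the same actions with its own I/O. *)
let run_on_chunks (chunks : string list) : string list =
  let collect (st, out) chunk =
    let st', actions = step st chunk in
    let lines =
      List.filter_map (function Emit l -> Some l | Need_input -> None) actions
    in
    (st', out @ lines)
  in
  snd (List.fold_left collect (init, []) chunks)
```

The core here is trivially small, so the driver is most of the code; that is 
the proportion one might expect for an I/O-dominated protocol, whereas for a 
complex state machine the balance tips the other way.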

The compelling argument against functors for I/O is the greatest common divisor 
argument, although even this is (somewhat) mitigated by careful signature 
inclusion rather than massive interface types. This does create a burden on 
maintaining fine-grained interface types.
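
A minimal sketch of what signature inclusion can look like, assuming 
hypothetical module types (`MONAD`, `READER`, `WRITER`) and a toy in-memory 
backend; these are not the signatures in cohttp's `lib/s.mli`.  Each functor 
asks only for the slice of the interface it uses, so a backend need not 
implement features that are irrelevant to it.

```ocaml
(* Hypothetical fine-grained signatures, composed with [include]. *)
module type MONAD = sig
  type 'a t
  val return : 'a -> 'a t
  val ( >>= ) : 'a t -> ('a -> 'b t) -> 'b t
end

module type READER = sig
  include MONAD
  type ic
  val read_line : ic -> string option t
end

module type WRITER = sig
  include MONAD
  type oc
  val write : oc -> string -> unit t
end

(* This functor needs only READER, so a read-only backend is enough. *)
module Count_lines (R : READER) = struct
  let count ic =
    let open R in
    let rec loop n =
      read_line ic >>= function
      | None -> return n
      | Some _ -> loop (n + 1)
    in
    loop 0
end

(* A toy blocking backend over an in-memory list of lines.
   It provides no WRITER operations at all. *)
module String_reader = struct
  type 'a t = 'a
  let return x = x
  let ( >>= ) x f = f x
  type ic = string list ref
  let read_line ic =
    match !ic with
    | [] -> None
    | l :: rest -> ic := rest; Some l
end

module C = Count_lines (String_reader)
let n = C.count (ref [ "a"; "b"; "c" ])
```

The maintenance cost mentioned above shows up as the number of small module 
types that have to be kept consistent with each other.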

One area that really needs more investigation is the performance tradeoffs.  
It used to be that functors always introduced an extra indirection, but with 
Pierre Chambart's new inliner this is greatly mitigated.  The pressure now goes 
onto the GC for the purely functional code that will do a lot more allocation...

-anil

> On 28 Mar 2015, at 19:40, Nicolas Ojeda Bar <[email protected]> wrote:
> 
> Hi,
> 
> While I do not have any particular criticism of cohttp, there is another 
> approach to this problem that I personally find far more elegant and 
> flexible.  I will talk mostly of how to handle IO, but the same approach 
> works with any type of "layering".  Let us call it the "functor-less" 
> approach.
> 
> The main difference is that, while functors are used to parametrise the IO 
> operations used *in* the code (often by way of defining an enclosing monad), 
> the functor-less approach takes the IO completely *out* of the code.  Instead 
> IO actions are returned as values that guide the actions of particular 
> backends.  This involves formulating the sequencing of IO actions as a state 
> machine.  Rather than trying to explain the general setup, let me point to 
> some well-known examples, which I hope will convey the essence of this 
> approach:
> 
> - ocaml-tls (purely functional style)
> 
> <https://github.com/mirleft/ocaml-tls/blob/master/lib/engine.mli#L116>
> 
> (Lwt backend) 
> <https://github.com/mirleft/ocaml-tls/blob/master/lwt/tls_lwt.ml#L75>
> 
> - D Buenzli's codecs (imperative signature), eg
> 
> <https://github.com/dbuenzli/uutf/blob/master/src/uutf.mli#L207>
> 
> (Unix backend) 
> <https://github.com/dbuenzli/uutf/blob/master/src/uutf.mli#L446>
> 
> Some advantages of this approach:
> 
> - using functors requires following a greatest common divisor design - the 
> argument signature has to account for all relevant features of each backend, 
> whether they are relevant or not for a particular one. Similarly, adding a 
> new feature to a backend in the functorised approach often means giving an 
> interpretation of this feature to all other backends, even if it does not 
> make much sense.
> 
> - by construction it completely decouples IO and the program logic.  The 
> advantages of this with respect to clarity, error handling, safety, testing, 
> are hard to overestimate.
> 
> - do not have to write code in monadic style
> 
> Best wishes,
> Nicolas
> 
> On Sat, Mar 28, 2015 at 4:36 PM, Anil Madhavapeddy <[email protected]> wrote:
> On 28 Mar 2015, at 14:24, Trevor Smith <[email protected]> wrote:
>> 
>> Hi all,
>> 
>> I was wondering if there is a document somewhere describing why the 
>> different backends to cohttp don't have a unified client/server interface? 
>> It seems like it would be such a boon for the user to be able to write their 
>> code but be able to choose the backends. I realize that this must have 
>> already received much discussion but am not sure where it is located.
> 
> Hi Trevor,
> 
> There's no design document describing this, mainly because CoHTTP started as 
> an informal bet between me and Yaron Minsky that the current design would be 
> impossible/a bad idea.  The jury's still out on the verdict, but I don't 
> think I've lost yet :-)
> 
> I wanted to build an HTTP implementation as an "onion", with the portable 
> parsing core progressively introducing I/O, and then higher-level 
> abstractions for various HTTP operations.  Here's my description of each 
> layer (that eventually ought to go into a design doc in the CoHTTP repo to 
> make it more accessible to newcomers to the codebase):
> 
> - The very first layer (in `lib/`) is a pure OCaml, non-blocking layer that 
> handles simple parts of the HTTP protocol such as parsing requests and 
> responses, various header parsers (e.g. cookies) and codes.
> 
> - Some layers of HTTP need some notion of I/O, and so there is a set of 
> signatures in `lib/s.mli` that defines some common module types that can be 
> used to build parameterised modules (also known as functors).  The first one 
> used in the `lib/` layer is the IO module type, which defines the minimal 
> collection of functions used by cooperative threading libraries.  The pure 
> HTTP core uses this IO module to capture IO-based operations, such as 
> Transfer_IO (for transfer encoding).
> 
> - There are three implementations that satisfy the IO module in the tree: 
> Lwt, Async and String.  The first two are full cooperative threading 
> libraries, and the latter is used by the js_of_ocaml backend to read/write 
> between Strings.
> 
> - Now that IO has been handled, we can send HTTP requests and responses from 
> Lwt or Async.  However, at this point some differences appear in the 
> implementations of Async and Lwt, notably in how they handle cancellation of 
> threads and also higher-level iterators (e.g. Async has Pipes, and Lwt has 
> Lwt_stream -- both quite different).  Therefore, we build backend-specific 
> Client and Server modules that use their respective threading libraries in as 
> native a style as possible, but still reusing the core HTTP library from 
> `lib/`.  These can be found in `Cohttp_lwt` and `Cohttp_async` respectively.  
> Dave Scott also wrote an (as yet not merged) POSIX blocking version that they 
> use in the XenAPI daemon.
> 
> - Lwt comes with an additional twist -- it is portable to both Unix *and* the 
> MirageOS, which has no Unix at all!  Lwt makes it possible to define a "Lwt 
> core" that uses the portable Lwt thread abstractions, but doesn't use any 
> OS-specific functionality.  Thus we can define an HTTP Client and Server in 
> Cohttp_lwt, but still not tie ourselves to one particular OS.  This Cohttp_lwt 
> is then used by the Cohttp_lwt_unix and Cohttp_mirage backends to hook it 
> into the operating system.
> 
> - There's no commonality at present between Cohttp_async and Cohttp_lwt, but 
> that's the topic of a design discussion at the moment.  It should be possible 
> to build a common signature between the two, and Rudi Grinberg took a shot at 
> this a while back.  I'm not sure that it's worth the trouble right now.
> 
> - Andy Ray did something interesting with the Lwt backend: he ported it to 
> JavaScript by implementing an IO backend that marshals the requests to and 
> from strings.  This allows REST API users built over Cohttp (such as 
> ocaml-github) to compile to pure JavaScript as well.
> 
> Drawbacks:
> 
> - The heavy use of functors does make it hard to navigate the 'end user' API, 
> even though those interfaces never expose any functors (for instance, you 
> just use Cohttp_lwt_unix directly in most cases). This is a drawback of 
> current OCaml tooling, and Merlin (for IDEs) and Codoc (for cross-referenced 
> documentation) will fix this soon.
> 
> - A bigger problem that needs to be addressed in Cohttp2 is body handling, 
> which we basically got wrong in this iteration.  The Body module is not 
> idempotent, so to_string does not always return the same value if called 
> multiple times.  The caller can currently be careful, but this is just an 
> awful part of the API.  There are enough users of Cohttp that we'll leave it 
> for 1.0, but hopefully fix it quite rapidly for 2.0.
> 
> - Cohttp is not a complete HTTP client, and doesn't implement the full logic 
> for redirections, loop detection and so on.  That's the job of a library 
> built over it, and there is some nascent code in opam-mirror that can do this 
> [1].  Before building this, David Sheets and I want to look at some of the 
> larger API clients built using it (such as Vincent Bernardoff's BitStamp 
> API [2]) and take a shot at a portable client API that will work with both 
> Lwt and Async.
> 
> So was functorising this heavily such a good idea?  I think so -- the litmus 
> test is whether or not there is more than one different implementation for 
> each parameterised module, and this has worked out particularly well for the 
> Cohttp_lwt backend, where there are now 4 (!) very different implementations.
> 
> Hope this helps,
> Anil
> 
> [1] 
> https://github.com/avsm/opam-mirror/blob/master/opam_mirror_fetch_urls.ml#L49
> [2] https://github.com/vbmithr/ocaml-bitstamp-api
> _______________________________________________
> MirageOS-devel mailing list
> [email protected]
> http://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel
> 
>

