Re: [MirageOS-devel] Cohttp Design -- LWT, Async, JS, Mirage Compatibility

Nicolas Ojeda Bar Sat, 28 Mar 2015 12:42:06 -0700

Hi,

While I do not have any particular criticism of cohttp, there is another
approach to this problem that I personally find far more elegant and
flexible.  I will talk mostly of how to handle IO, but the same approach
works with any type of "layering".  Let us call it the "functor-less"
approach.


The main difference is that, while functors are used to parametrise the IO
operations used *in* the code (often by way of defining an enclosing
monad), the functor-less approach takes the IO completely *out* of the
code.  Instead IO actions are returned as values that guide the actions of
particular backends.  This involves formulating the sequencing of IO
actions as a state machine.  Rather than trying to explain the general
setup, let me point to some well-known examples, which I hope will convey
the essence of this approach:

- ocaml-tls (purely functional style)

<https://github.com/mirleft/ocaml-tls/blob/master/lib/engine.mli#L116>

(Lwt backend)
<https://github.com/mirleft/ocaml-tls/blob/master/lwt/tls_lwt.ml#L75>

- Daniel Bünzli's codecs (imperative signature), e.g.

<https://github.com/dbuenzli/uutf/blob/master/src/uutf.mli#L207>

(Unix backend)
<https://github.com/dbuenzli/uutf/blob/master/src/uutf.mli#L446>
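
To make the idea concrete, here is a minimal sketch of a line reader
written in this style.  It is only an illustration: the names (action,
state, feed, step, run_on_chunks) are hypothetical and are not taken from
ocaml-tls or uutf.  The logic never performs IO; it returns a value that
tells the backend what to do next.

```ocaml
(* Hypothetical sketch of the functor-less style: a pure line reader
   expressed as a state machine. *)

(* What the machine asks of its driver after each step. *)
type action =
  | Need_input        (* the backend should obtain more bytes *)
  | Line of string    (* a complete line is available *)
  | Closed            (* input has ended and everything was emitted *)

type state = { pending : string; eof : bool }

let initial = { pending = ""; eof = false }

(* Hand over bytes obtained by the backend; [None] signals end of input. *)
let feed st = function
  | Some bytes -> { st with pending = st.pending ^ bytes }
  | None -> { st with eof = true }

(* Pure transition: no IO happens here, only a description of the
   next action. *)
let step st =
  match String.index_opt st.pending '\n' with
  | Some i ->
    let line = String.sub st.pending 0 i in
    let rest =
      String.sub st.pending (i + 1) (String.length st.pending - i - 1) in
    { st with pending = rest }, Line line
  | None when st.eof && st.pending <> "" ->
    (* Flush a final line that has no trailing newline. *)
    { st with pending = "" }, Line st.pending
  | None when st.eof -> st, Closed
  | None -> st, Need_input

(* One possible backend: drive the machine from an in-memory list of
   chunks.  A Unix, Lwt or Async backend would differ only in how it
   obtains the chunks. *)
let run_on_chunks chunks =
  let rec loop st chunks acc =
    match step st with
    | st, Line l -> loop st chunks (l :: acc)
    | _, Closed -> List.rev acc
    | st, Need_input ->
      (match chunks with
       | [] -> loop (feed st None) [] acc
       | c :: rest -> loop (feed st (Some c)) rest acc)
  in
  loop initial chunks []
```

Since step is a plain function, it can be tested by feeding it strings
directly, with no sockets or threading library involved.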

Some advantages of this approach:

- using functors requires following a greatest common divisor design - the
argument signature has to account for all relevant features of each
backend, whether they are relevant or not for a particular one. Similarly,
adding a new feature to a backend in the functorised approach often means
giving an interpretation of this feature to all other backends, even if it
does not make much sense.

- by construction it completely decouples IO from the program logic.  The
advantages of this for clarity, error handling, safety and testing are
hard to overestimate.

- you do not have to write code in monadic style
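
For contrast, the functorised style that the first and third points refer
to can be sketched as follows.  This is an assumption-laden illustration:
the IO signature below is hypothetical and much smaller than cohttp's real
one, and Make, String_io and read_headers are made-up names.

```ocaml
(* Hypothetical IO signature; every backend must provide all of it. *)
module type IO = sig
  type 'a t
  type ic
  val return : 'a -> 'a t
  val ( >>= ) : 'a t -> ('a -> 'b t) -> 'b t
  val read_line : ic -> string option t
end

(* Functorised logic: IO stays inside the code, sequenced monadically. *)
module Make (IO : IO) = struct
  open IO

  (* Collect lines until a blank line or the end of input. *)
  let rec read_headers ic acc =
    read_line ic >>= function
    | None | Some "" -> return (List.rev acc)
    | Some l -> read_headers ic (l :: acc)
end

(* A trivial in-memory instance, where the "monad" is the identity. *)
module String_io = struct
  type 'a t = 'a
  type ic = string list ref
  let return x = x
  let ( >>= ) x f = f x
  let read_line ic =
    match !ic with
    | [] -> None
    | l :: rest -> ic := rest; Some l
end

module Headers = Make (String_io)
```

Any operation added to IO for the benefit of one backend must then be
given some interpretation in String_io and every other instance as well.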

Best wishes,
Nicolas

On Sat, Mar 28, 2015 at 4:36 PM, Anil Madhavapeddy <[email protected]> wrote:

> On 28 Mar 2015, at 14:24, Trevor Smith <[email protected]>
> wrote:
>
>
> Hi all,
>
> I was wondering if there is a document somewhere describing why the
> different backends to cohttp don't have a unified client/server interface?
> It seems like it would be such a boon for the user to be able to write
> their code but be able to choose the backends. I realize that this must
> have already received much discussion but am not sure where it is located.
>
>
> Hi Trevor,
>
> There's no design document describing this, mainly because CoHTTP started
> as an informal bet between me and Yaron Minsky that the current design
> would be impossible/a bad idea.  The jury's still out on the verdict, but I
> don't think I've lost yet :-)
>
> I wanted to build an HTTP implementation as an "onion", with the portable
> parsing core progressively introducing I/O, and then higher-level
> abstractions for various HTTP operations.  Here's my description of each
> layer (that eventually ought to go into a design doc in the CoHTTP repo to
> make it more accessible to newcomers to the codebase):
>
> - The very first layer (in `lib/`) is a pure OCaml, non-blocking layer
> that handles simple parts of the HTTP protocol such as parsing requests and
> responses, various header parsers (e.g. cookies) and codes.
>
> - Some layers of HTTP need some notion of I/O, and so there is a set of
> signatures in `lib/s.mli` that defines some common module types that can be
> used to build parameterised modules (also known as functors).  The first
> one used in the `lib/` layer is the IO module type, which defines the
> minimal collection of functions used by cooperative threading libraries.
> The pure HTTP core uses this IO module to capture IO-based operations, such
> as Transfer_IO (for transfer encoding).
>
> - There are three implementations that satisfy the IO module in the tree:
> Lwt, Async and String.  The first two are full cooperative threading
> libraries, and the last is used by the js_of_ocaml backend to read/write
> between Strings.
>
> - Now that IO has been handled, we can send HTTP requests and responses
> from Lwt or Async.  However, at this point some differences appear in the
> implementations of Async and Lwt, notably in how they handle cancellation
> of threads and also higher-level iterators (e.g. Async has Pipes, and Lwt
> has Lwt_stream -- both quite different).  Therefore, we build
> backend-specific Client and Server modules that use their respective
> threading libraries in as native a style as possible, but still reusing the
> core HTTP library from `lib/`.  These can be found in `Cohttp_lwt` and
> `Cohttp_async` respectively.  Dave Scott also wrote an (as yet not merged)
> POSIX blocking version that they use in the XenAPI daemon.
>
> - Lwt comes with an additional twist -- it is portable to both Unix *and*
> MirageOS, which has no Unix at all!  Lwt makes it possible to define a
> "Lwt core" that uses the portable Lwt thread abstractions, but doesn't use
> any OS-specific functionality.  Thus we can define an HTTP Client and
> Server in Cohttp_lwt, but still not tie ourselves to one particular OS.  This
> Cohttp_lwt is then used by the Cohttp_lwt_unix and Cohttp_mirage backends
> to hook it into the operating system.
>
> - There's no commonality at present between Cohttp_async and Cohttp_lwt,
> but that's the topic of a design discussion at the moment.  It should be
> possible to build a common signature between the two, and Rudi Grinberg
> took a shot at this a while back.  I'm not sure that it's worth the trouble
> right now.
>
> - Andy Ray did something interesting with the Lwt backend: he ported it to
> JavaScript by implementing an IO backend that marshals the requests to and
> from strings.  This allows REST API users built over Cohttp (such as
> ocaml-github) to compile to pure JavaScript as well.
>
> Drawbacks:
>
> - The heavy use of functors does make it hard to navigate the 'end user'
> API, even though those interfaces never expose any functors (for instance,
> you just use Cohttp_lwt_unix directly in most cases). This is a drawback of
> current OCaml tooling, and Merlin (for IDEs) and Codoc (for
> cross-referenced documentation) will fix this soon.
>
> - A bigger problem that needs to be addressed in Cohttp2 is body handling,
> which we basically got wrong in this iteration.  The Body module is not
> idempotent, so to_string does not always return the same value if called
> multiple times.  The caller can currently be careful, but this is just an
> awful part of the API.  There are enough users of Cohttp that we'll leave
> it for 1.0, but hopefully fix it quite rapidly for 2.0.
>
> - Cohttp is not a complete HTTP client, and doesn't implement the full
> logic for redirections, loop detection and so on.  That's the job of a
> library built over it, and there is some nascent code in opam-mirror that
> can do this [1].  Before building this, David Sheets and I want to look at
> some of the larger API clients built using it (such as Vincent
> Bernardoff's BitStamp API [2]) and take a shot at a portable client API
> that will work with both Lwt and Async.
>
> So was functorising this heavily such a good idea?  I think so -- the
> litmus test is whether or not there is more than one different
> implementation for each parameterised module, and this has worked out
> particularly well for the Cohttp_lwt backend, where there are now 4 (!)
> very different implementations.
>
> Hope this helps,
> Anil
>
> [1]
> https://github.com/avsm/opam-mirror/blob/master/opam_mirror_fetch_urls.ml#L49
> [2] https://github.com/vbmithr/ocaml-bitstamp-api
>
> _______________________________________________
> MirageOS-devel mailing list
> [email protected]
> http://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel
>
>
