> One bit of food for thought: you can't exactly mmap() in Elm, and even to get
> from bytes to `Array Int` you have to do some non-trivial unmarshalling. It
> may make as much sense to just bite the bullet and parse the whole thing
> (deeply) into an idiomatic data type up front, just like implementations of
> protobufs; by the time you take into account all of Elm's limitations, it's
> not clear to me how much keeping it in the wire format like the C++
> implementation does actually buys you.
> Doing an up front parse solves a lot of things, so if you go another route, be
> clear on why.

This is intended to be a prototype. `Array Int` is an awful data type for this use case, but it works well enough for prototyping. Specifically, for the context of others reading this thread, `Array` in Elm is a tree structure and will not provide reasonable performance. That said, I’m hoping that I can use `elm-bytes` <https://github.com/elm/bytes> in a better way than being forced to decode it into a full Elm data type – though as I say this, it seems that I’d need some buy-in that I can’t be guaranteed. I may end up with that solution in the long run, but I want to implement one with the double array (`Array (Array Int)`) to start and see if I can convince a few people.

Still, it’s not clear to me why you’d use Cap’n Proto if you’re going to do a full serialization/deserialization. Just use Protobufs at that point. You could argue that this existing for completeness is valuable, i.e. you can run capnp on your backend and not be forced to translate into a protobuf on your frontend, but I’m not sure that this is a good enough reason to write a library like this.

Additionally, it’s not like JavaScript doesn’t have more complex capabilities like Uint8Array that Elm could take advantage of. That said, I’m more or less treating this as immutable data and providing ways of reducing the cost of updates (such as batching updates). Haskell at least has the ST monad for performance. There just isn’t a better way of doing this in Elm as far as I know.

> It would make sense to publish this as a package by itself; it's a nice
> conceptual unit that would be useful as a library for other projects.

Sure, I was thinking the same thing. Just thought that I’d focus on the Capnproto implementation before publishing. It’s fairly separate though, so I’m not worried about separating it out once I’m ready.
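
To make the `elm-bytes` idea above a bit more concrete, here is a minimal sketch of reading a single 32-bit word straight out of a `Bytes` segment instead of unmarshalling everything into an `Array Int` first. The module name `WordAt` and the helper `uint32At` are made up for illustration and are not part of any existing package; only the `Bytes`, `Bytes.Decode` and `Bytes.Encode` calls come from elm/bytes. I have not measured how this compares to the double-array approach.

```elm
module WordAt exposing (example, uint32At)

import Bytes exposing (Bytes, Endianness(..))
import Bytes.Decode as Decode
import Bytes.Encode as Encode


{-| Read one little-endian unsigned 32-bit integer at a byte offset.

Hypothetical helper: it skips `offset` bytes and then decodes a single
value. The result is `Nothing` when the offset is negative or the read
would run past the end of the segment.

-}
uint32At : Int -> Bytes -> Maybe Int
uint32At offset segment =
    if offset < 0 then
        Nothing

    else
        Decode.decode
            (Decode.bytes offset
                |> Decode.andThen (\_ -> Decode.unsignedInt32 LE)
            )
            segment


{-| Build a 12-byte segment holding the words 1, 2, 3 and read the word
that starts at byte offset 4.
-}
example : Maybe Int
example =
    [ 1, 2, 3 ]
        |> List.map (Encode.unsignedInt32 LE)
        |> Encode.sequence
        |> Encode.encode
        |> uint32At 4
```

Here `example` evaluates to `Just 2`. A real reader would also have to resolve Cap'n Proto pointers and enforce the traversal limit; this only shows that individual words can be pulled out of the wire format on demand.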
> This was a somewhat awkward thing to cover with the Haskell
> implementation; what I ended up doing amounts to a glorified state
> monad:

So the `Struct` type is a glorified state monad. `fields` holds the record that acts as the struct’s definition. I’ve attached an example below that shows how I think this should work. Let me know if that makes sense and feels reasonably ergonomic.

Regarding namespacing in the parallel conversation: I think it’s kind of awful that Haskell records are accessed via functions instead of some scoped operator or the like. Not really useful as a comment, but I thought I’d add my displeasure.

Pointer field defaults: Field defaults in general are not features I feel super great about. Not that I’ve thought about this in horribly great depth, but they seem to be very problematic if they are ever updated – binaries built against the old and the new schema will read the same bytes as two different structs. I always assumed that’s why they were removed from proto3. They also don’t seem *that* useful as you can handle this on the application layer sufficiently well. I’m curious if others think differently and feel strongly about their inclusion.

    getMainPhone : Struct AddressBook -> Struct PhoneNumber
    getMainPhone s =
        s
            |> Capnp.get .people
            |> Capnp.List.get 0 AddressBook.person
            |> Capnp.get .mainPhone AddressBook.person_phoneNumber

    -- assume d : Data exists. This is an `Array (Array Int)`
    -- Inputs:
    -- Struct
    --     { data = d
    --     , fields =
    --         -- Field AddressBook (Capnp.List.List (StructField Person))
    --         { people = ...
    --         }
    --     , viewOffset = (0, 0)
    --     , currentTraversalDistance = 0
    --     , traversalLimit = 67108864
    --     }
    -- Outputs:
    -- Struct
    --     { -- Data has not been updated. Hopefully, d is not actually copied,
    --       -- and is simply a pointer, but I’m not sure how this works exactly.
    --       -- If I have to, I can always separate d from the struct definition.
    --       data = d
    --     , fields =
    --         -- Fields have been updated to a PhoneNumber
    --         { number = ...
    --         , type = ...
    --         }
    --     , -- View Offset represents the index into the data above.
    --       -- Updated as necessary. We assume that the new offset is 40 here.
    --       viewOffset = (0, 40)
    --     , -- Data traversed so far. Assume that we've only traversed 30 bytes
    --       -- for whatever reason.
    --       currentTraversalDistance = 30
    --     , traversalLimit = 67108864
    --     }

From: David Renshaw <dwrens...@gmail.com>
Sent: Thursday, May 30, 2019 5:30 AM
To: Ian Denhardt <i...@zenhack.net>
Cc: prasanth somasundar <mezu...@live.com>; capnproto <capnproto@googlegroups.com>
Subject: Re: [capnproto] Cap'n Proto for Elm

Thanks! I wrote some comments inline below.

On Wed, May 29, 2019 at 11:38 PM Ian Denhardt <i...@zenhack.net> wrote:

> Quoting David Renshaw (2019-05-29 21:33:03)
>
> > This has piqued my interest. Which parts of the schema language don't
> > map well to Haskell/Elm?
>
> The biggest one is nested namespaces, per discussion. Neither language has intra-module namespaces, so you either end up doing a bunch of complex logic to split stuff across multiple modules and still break dependency cycles (in Haskell; per my earlier message, in Elm you're just SOL, since mutually recursive modules are just not supported, full stop), or you deal with long_names_with_underscores (Haskell actually uses the single quote as a namespace separator). This is a problem for the Go implementation as well; some of the stuff from sandstorm's web-session.capnp spits out identifiers that are pushing 100 characters. (I actually bumped into @glycerine at a meetup just the other day; we talked about this among other things).

That's unfortunate.

> The fact that union field names are scoped to the struct is a bit awkward, since union tag names are scoped at the module level in most ML-family languages. More makeshift namespacing.

Sounds like this is awkward mainly because of the previous problem, i.e. Haskell lacks nested namespaces.
With nested namespaces, you would define your union datatype within the namespace of the enclosing struct, and the tag names would have exactly the right namespace.

> The lack of a clean separation between unions and structs introduces a bit of an impedance mismatch as well; if you do things naively you end up with an awkward situation where *every* sum type is wrapped in a struct, which is a bit odd since they are used so liberally (and are normally so lightweight) in these languages. The Haskell implementation specifically looks for structs which are one big anonymous union so it can omit the wrapper. If you have an anonymous union you also need to invent a name for the field, since you can't actually have "anonymous" fields in records.

`which` is the usual name for such a field, as in:
https://github.com/capnproto/capnproto/blob/0f368d5781872ffc3e63db54b0ac4a138b0e0a05/c%2B%2B/src/capnp/encoding-test.c%2B%2B#L121

> For Haskell, there's no way to talk about a record type without giving it a name, so every group needs an auxiliary type defined. There's not really anything clearly nicer to do than just name it <Type>'<field> or such, which makes the long name problem worse.
>
> Along similar lines, in Haskell you end up having to define auxiliary types for parameter and return types, and without more of a hint they end up being things like <Type>'<method>'params and <Type>'<method>'results -- a mouthful even for short type names. I've taken to just always manually giving my parameter and return arguments names to avoid this kind of compiler output; the schema is much more verbose, but the call site is much nicer. None of this section applies to Elm since you can just have anonymous record types.

Again, sounds like this is awkward mainly as a consequence of Haskell's lack of nested namespaces.
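
The naming points above can be sketched in Elm. This is only an illustration under assumptions: the `Shape` schema, the `Shape_` prefix and the `describe` helper are invented here and are not output of any existing code generator. It shows a struct with one ordinary field plus an anonymous union (stored under the invented field name `which`), module-level constructors carrying the struct name as a makeshift namespace, and a group represented as an anonymous record.

```elm
module Shape exposing (Shape, Shape_Which(..), describe, example)


{-| Hypothetical generated type for a schema struct `Shape` with one
ordinary field (`area`) and an anonymous union.
-}
type alias Shape =
    { area : Float
    , which : Shape_Which -- invented name for the anonymous union field
    }


{-| Constructors are scoped to the module in Elm, so each one carries the
struct name as a prefix to avoid clashes with other unions.
-}
type Shape_Which
    = Shape_Circle Float
    | Shape_Rectangle { width : Float, height : Float } -- group as an anonymous record


{-| Pattern match on the union; no auxiliary named type is needed for the
rectangle group.
-}
describe : Shape -> String
describe shape =
    case shape.which of
        Shape_Circle radius ->
            "circle of radius " ++ String.fromFloat radius

        Shape_Rectangle { width, height } ->
            "rectangle " ++ String.fromFloat width ++ " by " ++ String.fromFloat height


example : String
example =
    describe { area = 6, which = Shape_Rectangle { width = 2, height = 3 } }
```

Here `example` evaluates to `"rectangle 2 by 3"`. A struct that is nothing but one anonymous union could drop the record wrapper and expose the custom type directly, which is the special case the Haskell implementation is described as detecting.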
> I intentionally decided to just not support custom default values for pointer fields; it gets really awkward because messages can be mutable or immutable, and you end up needing different implementation strategies for each type; for immutable messages you can't do what most implementations do (copy the value in place on first access),

Copying into an immutable message would mean mutating it, so I agree that's not a good way to go.

> but you could "follow" the pointer into some constant defined in the generated code without a copy. But that gets weird because there are functions to access the underlying message/segment, so you could run into situations where you've jumped to a whole other message silently.

Are there reasons that client code needs to use these functions? If not, is there a way for you to hide them or mark them as internal-use-only?

> With mutable messages you can do the normal thing, but writing code that's generic over both of these gets really weird.
>
> At some point I ended up checking the schemas that ship with capnproto, and with sandstorm, and discovered that, in >9000 lines of schema source, the feature was used exactly twice, both to set the default value of a text parameter to the empty string. So I just said "screw it, this is a waste of time." The plugin just prints a warning to stderr and ignores the custom default.

For what it's worth, I actually had someone request this feature last month:
https://github.com/capnproto/capnproto-rust/issues/127
I'm not sure what their use case is, though.

> I actually have a much longer critique that I think would be worth writing, including some things that aren't a problem for Haskell specifically, but cause problems for other languages -- and I am being bothered to go help with dinner, so I'll leave it at this for now.

I'd be eager to read the longer critique!

> -Ian

--
You received this message because you are subscribed to the Google Groups "Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email to capnproto+unsubscr...@googlegroups.com.
Visit this group at https://groups.google.com/group/capnproto.
To view this discussion on the web visit https://groups.google.com/d/msgid/capnproto/155918727572.10312.15632533580192568031%40localhost.localdomain.