Re: [E-devel] [RFC] Eina_Slice and Eina_Rw_Slice

The Rasterman Mon, 15 Aug 2016 19:39:07 -0700

On Mon, 15 Aug 2016 22:35:58 -0300 Gustavo Sverzut Barbieri
<[email protected]> said:


> On Mon, Aug 15, 2016 at 8:13 PM, Carsten Haitzler <[email protected]>
> wrote:
> > On Mon, 15 Aug 2016 12:07:16 -0300 Gustavo Sverzut Barbieri
> > <[email protected]> said:
> >
> >> > i'm still not that happy. first slice should not be void *. it needs a
> >> > proper type. unsigned char *, char * - at least bindings can expose a
> >> > useful type, though seriously - some languages just are broken (js for
> >> > example only has a number type... it has no concept of a blob of binary
> >> > data unless you want to do something silly like an array of numbers...
> >> > where you strictly use the value ranges 0 to 255 per number thus using
> >> > 64 bits per 8-bit value effectively... ugh :( ... but that's what we have,
> >> > so unsigned char or char arrays... (or pointers) not void *.
> >>
> >> I'm playing with the Eina_Slice idea in my branch; in there I found that
> >> a union with "uint8_t *bytes" is good for pointer arithmetic while "void
> >> *mem" is good since we can avoid casts.
> >>
> >> As for JS, you do have buffers and binary arrays; most engines support
> >> those due to the performance reasons you've mentioned.
> >>
> >>
> >> > technically if you REALLY want zero copy, then any read or write buffers
> >> > MUST be allocated by efl.net - this slice interface will have to EXPOSE a
> >> > region of a buffer to write to or read from. the allocation of that
> >> > buffer is managed by efl.net - it grants access to the slice.
> >>
> >> see my code, this is done inside efl, but by a different entity as
> >> Efl.Net stuff keeps no buffers at all.
> >
> > someone, somewhere will have to keep buffers. if efl.net no longer can do
> > buffering for you (just like the kernel does - but the kernel is limited in
> > its buffer size), then you push that buffering back up into an app where it
> > has to handle write fails and now has to keep data in an internal buffer of
> > > its own. if the api user is now left to handle write fails and has to do
> > their own buffering then this api is going to be a failure. if you require
> > them to instantiate multiple objects and bind them together in a pipeline
> > to get this, then it's a failure. if buffering isn't transparent ... it's a
> > failure as an api. there is no value in this api beyond a simple read() and
> > write() then so why not just expose an fd and say "go for it - do it
> > yourself". you are heading this way, and this imho has no value as a layer
> > to make things easier to use. just some code to do connects/binds/listens
> > and then exposing an fd and say "use read/write" ...
> 
> It is the same, but you do not need to replicate this in every class
> like done in Ecore_Exe, Ecore_Con, Ecore_Con_URL... :-)
> 
> I was thinking just like you, but after talking to Tasn a bit I got
> what he meant with a "thin wrapper around syscalls" and at the end it
> does make sense, more sense actually.

if it is just a thin wrapper then what value does it provide?

> >> Efl.Io.Copier does and keeps a "read_chunk" segment that is used as
> >> memory for the given slice.
> >>
> >> This is why Eina_Slice and Eina_Rw_Slice play well in this
> >> scenario. For example you can get a slice of the given binbuf in order
> >> to hand it to other functions that will write/read to/from it. It
> >> doesn't require a new binbuf to be created or any COW logic.
> >
> > it requires a new eina slice struct to be allocated that points to the data
> > which is EXACTLY the below binbuf api i mention.
> 
> eina slice is a pair of 2 values, will always be. There is no opaque
> or need for pointer, or allocate. The eina_slice.h API is mostly about
> passing struct value, not reference/pointer.
> 
> with binbuf indeed you're right, given its complexity you end with an
> allocated opaque memory handle, magic validation, etc.

and that's what a slice is - it's an allocated opaque handle over a blob of
memory... is that not just binbuf?
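
For comparison, the by-value pair described in the quoted text could look
roughly like the sketch below. The names Slice, slice_of and slice_sub are
made up for illustration and are not the eina_slice.h API. The sketch only
shows that a struct returned by value needs no heap allocation and does not
dangle; it also shows that such a struct carries no ownership of the bytes it
points at.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read-only slice: a (length, pointer) pair with no ownership. */
typedef struct {
    size_t len;
    union {
        const void    *mem;   /* hand to memcpy()/write() without casts */
        const uint8_t *bytes; /* byte-wise pointer arithmetic */
    };
} Slice;

/* Returns the struct itself (a copy), not a pointer to a local variable. */
static Slice slice_of(const void *mem, size_t len)
{
    Slice s = { .len = len, .mem = mem };
    return s;
}

/* Sub-view starting at off, clamped to the parent; still no allocation. */
static Slice slice_sub(Slice s, size_t off, size_t len)
{
    assert(off <= s.len);
    if (len > s.len - off) len = s.len - off;
    return slice_of(s.bytes + off, len);
}
```

Whoever created the underlying memory still has to keep it alive for as long
as any such view is in use.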

> >> I'll keep going like this for a while, before merging I'll give a
> >> heads up so you guys can see it in use... if you happen to dislike it,
> >> then it's not that difficult to convert (ie: remove eina_slice.h and
> >> fix compile errors -- as opposed to manually differentiate all
> >> Eina_Binbuf users to see what should be a slice and what should not).
> >>
> >>
> >> > also just like binbuf you are going to have to ALLOCATE a slice every
> >> > time you get one... just the same. what you do in your example above is
> >> > illegal in c and c++. you're returning stack stuff that has been popped.
> >> > to return something that will survive it has to allocate.. and this
> >> > would be the same in any language that's the same.
> >>
> >> Not really, the entity is different. Efl.Io.Copier does what you want,
> >> and keeps a single Eina_Binbuf where read-data is stored until it can
> >> be pushed to the writer (destination).
> >>
> >> Both read and write slices point to this internal Eina_Binbuf, thus a
> >> single storage. The API to read() and write() are simple as you can
> >> see in my code:
> >
> > i think this is the failure in design. forcing there to be a single buffer
> > only. if you do this then this limits all your other designs to work around
> > it.
> 
> not really, not at all. You can implement the buffers if you want. You
> can do the threaded design if you want. You can do this:
> 
> > keep a LIST of write buffers, and a LIST of read buffers. call callbacks on
> > the read buffers once they have come in. write buffers - keep in list and
> > spool out as write becomes available.
> >
> > personally i'd implement this with a thread these days. have a i/o slave
> > that does all the reading and writing and it will allocate any new buffers
> > > as needed and wrap a binbuf around them and hand them out to users of the
> > api, or allocate buffers for writers, let them write to the buffer, then
> > add them to the send queue when sent... there is a sender thread just for
> > this (maybe a single send thread for all of efl.net, and a single reader
> > thread to avoid thread count bloat - we can later make more threads if
> > performance requires it and there are enough connections to warrant it).
> 
> This can be done transparently with the current API proposal.

but in your current one - writes will fail because you don't allocate or expand
an existing buffer - right? once full.. then what?

> >>  read:
> >> https://git.enlightenment.org/core/efl.git/diff/src/lib/ecore/efl_io_reader_fd.c?h=devs/barbieri/efl-io-interfaces&id=7895d243bd204ecf986292da4866dd84cceb7c30
> >> write:
> >> https://git.enlightenment.org/core/efl.git/diff/src/lib/ecore/efl_io_writer_fd.c?h=devs/barbieri/efl-io-interfaces&id=7895d243bd204ecf986292da4866dd84cceb7c30
> >>
> >> If we convert to handle an Eina_Binbuf *buf to them, we'd need to add
> >> few more parameters, like:
> >>  - read/write: offset and size, since you may want just a small part
> >> of the buffer to be used (like in Efl.Io.Copier if you do
> >> line-buffering). Offset and size is all Eina_Slice is about.
> >
> > > how? see below. it already does this. you can wrap a binbuf around any
> > > address + length. it can return the base pointer. the bytes you want are
> > > base + offset in the "array" returned.
> 
> If you create another binbuf "viewing" (managed + ro) the parent,
> okay. However it is not clear to user, such as:
> 
> pass a binbuf to read: how much is it going to read? from 0 to
> eina_binbuf_length_get()?  What if eina_binbuf_length_get() == 0? Will
> it resize my buffer to read more? How do I limit those? How do I know
> how much was read? checking eina_binbuf_length_get()?

if you create a binbuf from an existing pointer - then yes. it'd give access to
just that memory range (though if you go beyond the length of the mem region or
before the start, behaviour is undefined - in c/c++ as usual). same with
slice - right? if you expose a pointer at all... if you only make it work by
copies in and out of the interface then it's not zero copy. :)

> These are conventions we'd need to create and enforce.  With slice
> it's plain and thus clear, there is no possibility it would realloc,
> copy-on-write, grow... so it's what you passed: mem + size.

binbuf can do this too - fail to append for example if it's read-only or an
"adopted ptr".
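
A minimal sketch of that behaviour, assuming a hypothetical FixedBuf type
rather than the real Eina_Binbuf: append either fits into the memory that was
handed over or it fails, and it never reallocates or copies elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer wrapped around caller-provided memory. */
typedef struct {
    unsigned char *mem;
    size_t len;  /* bytes currently used */
    size_t cap;  /* fixed capacity of mem */
    bool ro;     /* read-only: every append is refused */
} FixedBuf;

/* Appends n bytes if they fit; returns false instead of growing. */
static bool fixedbuf_append(FixedBuf *b, const void *src, size_t n)
{
    assert(b != NULL && src != NULL);
    if (b->ro || n > b->cap - b->len) return false;
    memcpy(b->mem + b->len, src, n);
    b->len += n;
    return true;
}
```

The caller has to check the return value, which is the kind of convention the
quoted text says would need to be documented and enforced.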

> >>  - read/write return parameter "used" with the amount of bytes that
> >> were processed, otherwise user needs to store previous
> >> eina_binbuf_length_get(), get the new size and compute himself.
> >>
> >>  - read: a base chunk size, after all we're not doing per-byte calls
> >> to read(2) and going with a fixed large enough parameter is kind of
> >> not-optimal.
> >>
> >> Then you can see the binbuf could be used, I just think they wouldn't
> >> be that convenient. I'll keep going like this and convert to Binbuf if
> >> Slice turns out to be disliked once people try to use it for real :-)
> >
> > they'll be the same. if there is a specific feature missing - then why not
> > just add it? i don't see a feature missing. a binbuf can wrap any arbitrary
> > blob of bytes.
> 
> > currently the only features missing I see are simple to implement,
> > however the usage/clarity may not be fixed.
> 
> features missing would be some kind of "expand" or "resize" the
> backing buffer without "append()". This could be used to expand buffer
> in-place then use a pointer to it with other calls (such as read(2)).

so question... where would you need this for efl.net? other than just filling a
single binbuf with data either on the "app" side or inside efl.net when reading
a socket... - both of which can just alloc a buffer - write n bytes to it then
create a new binbuf around that ptr (and binbuf can handle freeing it later too
- maybe you are missing a custom "free/release" func in binbuf)

the reason i keep going on about this is... i see slice right now as simply
duplicating binbuf and thus i see it as adding to learning curves AND adding
to the cost of writing manual bindings for efl for a specific language target
etc. etc.

> eventually "ro" flags (bool) could be changed to some more conditions,
> such as fixed capacity (no realloc), this can be used when you want to
> avoid reallocs() on reset, remove or append.
> 
> bottom line is: feature wise, it's easy to implement. But the number
> of features and lack of clarity on how it's supposed to be used is
> what's bothering me :-/

then maybe documentation and sample code will make it clear. the SIMPLE usage
is:

on writes:
  1. ask efl.net to create a write buffer.
  2. append to write buffer (binbuf_append - yes it's a copy!)
  3. send/submit the binbuf for sending (efl.net adds it to a list of pending
     to-write binbufs)
  4. inside efl.net the binbuf is freed (or returned to a pool of buffers ready
     to re-use)
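
The write steps above can be sketched in plain C as follows. This is only an
illustration under stated assumptions: WriteQueue, wq_submit, wq_flush and the
Sink callback are hypothetical names, not efl.net API, and limited_sink stands
in for a socket whose kernel buffer has limited room.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct Buf {
    struct Buf *next;
    size_t len, sent;        /* sent = bytes already accepted by the sink */
    unsigned char data[];
} Buf;

typedef struct { Buf *head, *tail; } WriteQueue;

/* Returns how many bytes were accepted; may be fewer than n, or 0. */
typedef size_t (*Sink)(void *ctx, const unsigned char *p, size_t n);

/* Steps 1-3: allocate a buffer, copy the payload in, queue it. */
static int wq_submit(WriteQueue *q, const void *mem, size_t len)
{
    assert(q != NULL && mem != NULL);
    Buf *b = malloc(sizeof(*b) + len);
    if (!b) return -1;
    b->next = NULL; b->len = len; b->sent = 0;
    memcpy(b->data, mem, len);
    if (q->tail) q->tail->next = b; else q->head = b;
    q->tail = b;
    return 0;
}

/* Step 4: spool out while the sink accepts data. A short write keeps the
 * buffer queued instead of failing. Returns the buffers still pending. */
static size_t wq_flush(WriteQueue *q, Sink sink, void *ctx)
{
    while (q->head) {
        Buf *b = q->head;
        b->sent += sink(ctx, b->data + b->sent, b->len - b->sent);
        if (b->sent < b->len) break;
        q->head = b->next;
        if (!q->head) q->tail = NULL;
        free(b);
    }
    size_t pending = 0;
    for (Buf *b = q->head; b; b = b->next) pending++;
    return pending;
}

/* Stand-in sink: accepts at most *room bytes in total. */
static size_t limited_sink(void *ctx, const unsigned char *p, size_t n)
{
    size_t *room = ctx;
    size_t take = n < *room ? n : *room;
    (void)p;
    *room -= take;
    return take;
}
```

In this sketch the caller never sees a failed write; wq_flush would be called
again whenever the descriptor becomes writable.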

on reads:
  1. efl.net creates a binbuf of UP to N bytes in size
  2. efl.net read()s into this binbuf up to the max length
  3. if length shorter, set binbuf len to be this shortened length
  4. append read binbuf to "read queue"
  5. walk through read queue calling callbacks on read buffers
  6. once callback has been called, return binbuf to the pool
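
The read steps above can be sketched the same way. Pool, Chunk, read_once and
count_bytes are hypothetical names for illustration only, and the read queue
of steps 4 and 5 is collapsed into a direct callback to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK_MAX 4096  /* the "UP to N bytes" limit, chosen arbitrarily */

typedef struct Chunk {
    struct Chunk *next;
    size_t len;
    unsigned char data[CHUNK_MAX];
} Chunk;

typedef struct { Chunk *free_list; } Pool;
typedef void (*ReadCb)(void *ctx, const unsigned char *data, size_t len);

/* Reuse a pooled chunk if there is one, otherwise allocate a new one. */
static Chunk *pool_get(Pool *p)
{
    Chunk *c = p->free_list;
    if (c) p->free_list = c->next;
    else c = malloc(sizeof(*c));
    if (c) { c->next = NULL; c->len = 0; }
    return c;
}

static void pool_put(Pool *p, Chunk *c)
{
    c->next = p->free_list;
    p->free_list = c;
}

/* One cycle: read() up to CHUNK_MAX bytes, record the actual length,
 * call the callback, return the chunk to the pool.
 * Returns bytes delivered, 0 on EOF, -1 on error. */
static ssize_t read_once(Pool *p, int fd, ReadCb cb, void *ctx)
{
    assert(p != NULL && cb != NULL);
    Chunk *c = pool_get(p);
    if (!c) return -1;
    ssize_t n = read(fd, c->data, sizeof(c->data));
    if (n > 0) {
        c->len = (size_t)n;
        cb(ctx, c->data, c->len);
    }
    pool_put(p, c);
    return n;
}

/* Example callback: adds up the bytes it is shown. */
static void count_bytes(void *ctx, const unsigned char *data, size_t len)
{
    (void)data;
    *(size_t *)ctx += len;
}
```

The data pointer is only valid during the callback, because the chunk goes
back to the pool right after it returns.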

that's simple. efl.net COULD accept foreign binbufs it didn't allocate for
writes but this may involve an extra copy. the reason to have efl.net do the
alloc for read and write buffers is that in theory this is the only way to have
network sends be zero copy as this data can be in dma memory explicitly. yes -
this doesn't actually happen today and the kernel will always do a copy into an
internal buffer when you write().

on read - same reason. dma memory. if you want the data somewhere else there
will likely be a memcpy anyway. when you do a read() a kernel will memcpy into
your userspace buffer from hw dma buffers anyway.

we could allow an "accelerated interface" where the writer allocs the binbuf
and fills it and hands it to efl.net. similar for read. the "app" would create a
binbuf and provide it to efl.net to write into (along with some rules like MUST
write a MINIMUM of n bytes or a MAXIMUM of n bytes but allow less). this would
be a little more work on the read side than the simple example above.

whether it's a binbuf or a slice i don't think matters here. they are just
views of the same thing - a blob of memory. :)

> >> > one way or another i don't think you can avoid allocating some object
> >> > that represents a blob of data. if its a slice, or a binbuf - it doesn't
> >> > matter. my point here is - create a binbuf and PASS It IN as a whole.
> >> > it's immutable once passed to efl.net or when efl.net passes it back to
> >> > the caller in events. the binbuf size is as large as is needed to
> >> > transport that piece of data.
> >> >
> >> > all your slice is is a binbuf under another name here. once you fix it
> >> > and do it right and allocate/free when done.
> >>
> >> the slice is just a view of memory, it doesn't free/allocate/grow.
> >
> > you have to allocate the Eina_Slice * struct. either way. if you want to
> > return one. eina_binbuf_manage_new_length() already is there and does this.
> > you are creating more api's that do the same thing. we have enough api's in
> > efl that duplicate functionality. explain how slice is different. you still
> > have to alloc the slice STRUCT. not the data buffer. the struct that wraps
> > the data buffer. that is precisely what the above already does.
> 
> see the code, there is no slice allocation, the struct is passed as
> value. It's plain clear that it's simply memory and its size, no
> copy-on-write, no realloc, no alloc, no free.
> 
> It's the same as passing "void *, size_t", but you force these to be
> linked, hinting their relationship, allowing bindings and the likes to
> map to them.

sure - you bound ptr+size in a datatype. you will have to ALLOCATE these if you
are to have any buffering. then what about the backing data behind the slice?
how can you keep it around when buffered? the slice doesn't have any ownership
of the data pointed to...

unless you intend to have no buffering. everything is just simple direct to
kernel read/write and if a write fails - pass back that failure, OR make writes
block?

> Also having two types just to enforce const-ness, nobody can get a
> "const Eina_Slice" that takes a "const void *mem" (union with "const
> uint8_t *bytes" for ease of use) and think that this memory will be
> modifiable. Likewise, getting a "Eina_Rw_Slice" with "void *mem" is
> clear that the purpose is to write to those bytes.
> 
> Say you want to write an Eina_Stringshare to some writer object:
> 
>     // clear it's read-only
>     Eina_Slice slice = eina_stringshare_slice_get(stringshared_stuff);
>     Eina_Slice remaining;
>     efl_io_writer_write(o, &slice, &remaining);
>     printf("wrote[" EINA_SLICE_STR_FMT "], remaining[" EINA_SLICE_STR_FMT "]\n",
>            EINA_SLICE_STR_PRINT(slice), EINA_SLICE_STR_PRINT(remaining));
> 
> With Eina_Binbuf:
> 
>     // new method, same number of calls... but people are unsure what
>     // happens if they write to it... is it going to be modified? copied?
>     // will the result be a stringshare as well?
>     Eina_Binbuf *slice = eina_stringshare_binbuf_get(stringshared_stuff);
>     EINA_SAFETY_ON_NULL_RETURN(slice); // need to check
>     Eina_Binbuf *remaining = NULL;
>     efl_io_writer_write(o, &slice, &remaining); // same call
>     printf("wrote[" EINA_BINBUF_STR_FMT "], remaining[" EINA_BINBUF_STR_FMT "]\n",
>            EINA_BINBUF_STR_PRINT(slice), EINA_BINBUF_STR_PRINT(remaining));
>     eina_binbuf_free(slice);
>     if (remaining) eina_binbuf_free(remaining);
> 
> 
> As you see, the number of calls can be made close, but the semantics
> are not that obvious anymore.
> 
> 
> -- 
> Gustavo Sverzut Barbieri
> --------------------------------------
> Mobile: +55 (16) 99354-9890
> 


-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    [email protected]


