Re: [E-devel] [RFC] Eina_Slice and Eina_Rw_Slice

Gustavo Sverzut Barbieri Mon, 15 Aug 2016 18:37:11 -0700

On Mon, Aug 15, 2016 at 8:13 PM, Carsten Haitzler <[email protected]> wrote:
> On Mon, 15 Aug 2016 12:07:16 -0300 Gustavo Sverzut Barbieri
> <[email protected]> said:
>
>> > i'm still not that happy. first slice should not be void *. it needs a
>> > proper type. unsigned char *, char * - at least bindings can expose a
>> > useful type, though seriously - some languages just are broken (js for
>> > example only has a number type... it has no concept of a blob of binary
>> > data unless you want to do something silly like an array of numbers...
>> > where you strictly use the value ranges 0 to 255 per number thus using
>> > 64bits per 8byt value effectively... ugh :( ... but that's what we have, so
>> > unsigned char or char arrays... (or pointers) not void *.
>>
>> I'm playing with Eina_Slice idea in my branch, in there I found that
>> an union "uint8_t *bytes" is good to do pointer arithmetic while "void
>> *mem" is good since we can avoid casts.
>>
>> As for JS, you do have buffer and binary arrays, most engines support
>> those due performance reasons you've mentioned.
>>
>>
>> > technically if you REALLY want zero copy, then any read or write buffers
>> > MUST be allocated by efl.net - this slice interface will have to EXPOSE a
>> > region of a buffer to write to or read from. the allocation of that buffer
>> > is managed by efl.net - it grants access to the slice.
>>
>> see my code, this is done inside efl, but by a different entity as
>> Efl.Net stuff keeps no buffers at all.
>
> someone, somewhere will have to keep buffers. if efl.net no longer can do
> buffering for you (just like the kernel does - but the kernel is limited in its
> buffer size), then you push that buffering back up into an app where it has to
> handle write fails and now has to keep data in an internal buffer of its own.
> if the api user is now left to handle write fails and have to do their own
> buffering then this api is going to be a failure. if you require them to
> instantiate multiple objects and bind them together in a pipeline to get this,
> then it's a failure. if buffering isn't transparent ... it's a failure as an
> api. there is no value in this api beyond a simple read() and write() then so
> why not just expose an fd and say "go for it - do it yourself". you are heading
> this way, and this imho has no value as a layer to make things easier to use.
> just some code to do connects/binds/listens and then exposing an fd and say
> "use read/write" ...


It is the same, but you do not need to replicate this in every class,
as is done in Ecore_Exe, Ecore_Con, Ecore_Con_URL... :-)

I used to think just like you, but after talking to Tasn a bit I
understood what he meant by a "thin wrapper around syscalls", and in the
end it does make sense, more sense actually.

>> Efl.Io.Copier does and keeps a "read_chunk" segment that is used as
>> memory for the given slice.
>>
>> This is why the Eina_Slice and Eina_Rw_Slice plays well in this
>> scenario. For example you can get a slice of the given binbuf in order
>> to handle to other functions that will write/read to/from it. It
>> doesn't require any a new binbuf to be created or COW logic.
>
> it requires a new eina slice struct to be allocated that points to the data
> which is EXACTLY the below binbuf api i mention.

An Eina_Slice is a pair of two values (memory and size), and always
will be. There is no opaque handle, no need for a pointer to it, and no
allocation. The eina_slice.h API is mostly about passing the struct by
value, not by reference/pointer.
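
A minimal sketch of what "a pair of values passed by value" means in C.
The Sketch_Slice type and the sketch_* functions are hypothetical
stand-ins, not the eina_slice.h API; the anonymous union (C11) mirrors
the "mem"/"bytes" union mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the proposed Eina_Slice: a length plus a
 * read-only pointer, viewable as raw memory or as bytes. */
typedef struct {
    size_t len;
    union {
        const void *mem;      /* pass to memcpy() and friends without casts */
        const uint8_t *bytes; /* convenient for pointer arithmetic */
    };
} Sketch_Slice;

/* Returns the slice BY VALUE: the two fields are copied to the caller,
 * so nothing is allocated and no popped stack memory is referenced. */
static Sketch_Slice
sketch_str_slice_get(const char *str)
{
    Sketch_Slice s;
    assert(str != NULL);
    s.len = strlen(str);
    s.mem = str;
    return s;
}

/* A sub-view is plain arithmetic on the pair; still no allocation. */
static Sketch_Slice
sketch_slice_seek(Sketch_Slice s, size_t offset)
{
    if (offset > s.len) offset = s.len;
    s.bytes += offset;
    s.len -= offset;
    return s;
}
```

The slice does not own the memory it points to, so it is only valid
while the underlying string or buffer is alive.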

With binbuf you are indeed right: given its complexity you end up with
an allocated opaque memory handle, magic validation, etc.


>> I'll keep going like this for a while, before merging I'll give a
>> heads up so you guys can see it in use... if you happen to dislike it,
>> then it's not that difficult to convert (ie: remove eina_slice.h and
>> fix compile errors -- as opposed to manually differentiate all
>> Eina_Binbuf users to see what should be a slice and what should not).
>>
>>
>> > also just like binbuf you are going to have to ALLOCATE a slice every time
>> > you get one... just the same. what you do in your example above it illegal
>> > in c and c++. you're returning stack stuff that has been popped. to return
>> > something that will survive it has to allocate.. and this would be the same
>> > in any language that's the same.
>>
>> Not really, the entity is different. Efl.Io.Copier does what you want,
>> and keeps a single Eina_Binbuf where read-data is stored until it can
>> be pushed to the writer (destination).
>>
>> Both read and write slices point to this internal Eina_Binbuf, thus a
>> single storage. The API to read() and write() are simple as you can
>> see in my code:
>
> i think this is the failure in design. forcing there to be a single buffer
> only. if you do this then this limits all your other designs to work around it.

not really, not at all. You can implement the buffers if you want. You
can do the threaded design if you want. You can do this:

> keep a LIST of write buffers, and a LIST of read buffers. call callbacks on the
> read buffers once they have come in. write buffers - keep in list and spool out
> as write becomes available.
>
> personally i'd implement this with a thread these days. have a i/o slave that
> does all the reading and writing and it will allocate any new buffers as needed
> and wrap a binbuf around them and hand them out to users of the api, or
> allocate buffers for writers, let them write to the buffer, then add them to
> the send queue when sent... there is a sender thread just for this (maybe a
> single send thread for all of efl.net, and a single reader thread to avoid
> thread count bloat - we can later make more threads if performance requires it
> and there are enough connections to warrant it).

This can be done transparently with the current API proposal.



>>  read:
>> https://git.enlightenment.org/core/efl.git/diff/src/lib/ecore/efl_io_reader_fd.c?h=devs/barbieri/efl-io-interfaces&id=7895d243bd204ecf986292da4866dd84cceb7c30
>> write:
>> https://git.enlightenment.org/core/efl.git/diff/src/lib/ecore/efl_io_writer_fd.c?h=devs/barbieri/efl-io-interfaces&id=7895d243bd204ecf986292da4866dd84cceb7c30
>>
>> If we convert to handle an Eina_Binbuf *buf to them, we'd need to add
>> few more parameters, like:
>>  - read/write: offset and size, since you may want just a small part
>> of the buffer to be used (like in Efl.Io.Copier if you do
>> line-buffering). Offset and size is all Eina_Slice is about.
>
> how? see below. it already does this. you can wrap a binbuf around any address
> + length. it can return the base pointer. the bytes you want are base + offset
> in the "array" returned.

If you create another binbuf "viewing" (managed + ro) the parent,
okay. However, it is not clear to the user what happens. For example:

if I pass a binbuf to read: how much is it going to read? From 0 to
eina_binbuf_length_get()? What if eina_binbuf_length_get() == 0? Will
it resize my buffer to read more? How do I limit that? How do I know
how much was read? By checking eina_binbuf_length_get()?

These are conventions we'd need to create and enforce. With a slice
it's plain and thus clear: there is no possibility that it would
realloc, copy-on-write or grow, so it is exactly what you passed:
mem + size.
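
To illustrate the "mem + size" contract, here is a hedged sketch of a
reader on top of read(2). Sketch_Rw_Slice and sketch_reader_read() are
hypothetical names and this is not the code from the branch linked
above; it only shows one possible convention: the slice says how much
may be read, and on return its length says how much was read.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-in for the proposed Eina_Rw_Slice. */
typedef struct {
    size_t len;
    union {
        void *mem;
        uint8_t *bytes;
    };
} Sketch_Rw_Slice;

/* Reads at most slice->len bytes into slice->mem.
 * On success returns 0 and shrinks slice->len to the number of bytes
 * actually read (0 means end of file). On failure returns the errno
 * value and sets slice->len to 0. The memory is never resized. */
static int
sketch_reader_read(int fd, Sketch_Rw_Slice *slice)
{
    ssize_t r;

    assert(slice != NULL);
    do {
        r = read(fd, slice->mem, slice->len);
    } while (r < 0 && errno == EINTR); /* retry if interrupted */

    if (r < 0) {
        int err = errno;
        slice->len = 0;
        return err;
    }
    slice->len = (size_t)r;
    return 0;
}
```

The caller answers every question above by construction: the upper
limit is the length it passed in, and the amount read is the length it
gets back.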

>>  - read/write return parameter "used" with the amount of bytes that
>> were processed, otherwise user needs to store previous
>> eina_binbuf_length_get(), get the new size and compute himself.
>>
>>  - read: a base chunk size, after all we're not doing per-byte calls
>> to read(2) and going with a fixed large enough parameter is kinda of
>> not-optimal.
>>
>> Then you can see the binbuf could be used, I just think they wouldn't
>> be that convenient. I'll keep going like this and convert to Binbuf if
>> Slice is to be disliked once people try to use it for real :-)
>
> they'll be the same. if there is a specific feature missing - then why not just
> add it? i don't see a feature missing. a binbuf can wrap any arbitrary blob of
> bytes.

Currently the only missing features I see are simple to implement;
however, the usage/clarity problem may remain unfixed.

The missing feature would be some way to "expand" or "resize" the
backing buffer without "append()". This could be used to expand the
buffer in place and then use a pointer to it with other calls (such as
read(2)).
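
A rough sketch of such an "expand" operation, written against a
hypothetical Sketch_Buf rather than Eina_Binbuf (no such call exists in
the binbuf API today; all names here are assumptions made for
illustration).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    size_t len;
    union {
        void *mem;
        uint8_t *bytes;
    };
} Sketch_Rw_Slice;

/* Hypothetical growable buffer: `len` bytes are valid, `capacity` are
 * allocated. */
typedef struct {
    uint8_t *data;
    size_t len;
    size_t capacity;
} Sketch_Buf;

/* Makes sure at least `extra` writable bytes exist past the current
 * length and returns them as a slice. The buffer length is NOT changed;
 * the caller commits what it actually wrote with sketch_buf_commit().
 * Returns an empty slice if the allocation fails. */
static Sketch_Rw_Slice
sketch_buf_expand(Sketch_Buf *buf, size_t extra)
{
    Sketch_Rw_Slice s;

    assert(buf != NULL);
    s.len = 0;
    s.mem = NULL;
    if (buf->capacity - buf->len < extra) {
        uint8_t *tmp = (uint8_t *)realloc(buf->data, buf->len + extra);
        if (!tmp) return s;
        buf->data = tmp;
        buf->capacity = buf->len + extra;
    }
    s.bytes = buf->data + buf->len;
    s.len = buf->capacity - buf->len;
    return s;
}

/* Marks `used` bytes of the previously returned slice as valid data. */
static void
sketch_buf_commit(Sketch_Buf *buf, size_t used)
{
    assert(buf != NULL && used <= buf->capacity - buf->len);
    buf->len += used;
}
```

Note that the returned slice is invalidated by the next expand, since
realloc() may move the data. That is exactly the kind of convention a
buffer type has to document and a plain slice never raises.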

Eventually the "ro" flag (bool) could be turned into a set of
conditions, such as fixed capacity (no realloc), which can be used when
you want to avoid realloc() on reset, remove or append.

bottom line is: feature wise, it's easy to implement. But the number
of features and lack of clarity on how it's supposed to be used is
what's bothering me :-/


>> > one way or another i don't think you can avoid allocating some object that
>> > represents a blob of data. if its a slice, or a binbuf - it doesn't matter.
>> > my point here is - create a binbuf and PASS It IN as a whole. it's
>> > immutable once passed to efl.net or when efl.net passes it back to the
>> > caller in events. the binbuf size is as large as is needed to transport
>> > that piece of data.
>> >
>> > all your slice is is a binbuf under another name here. once you fix it and
>> > do it right and allocate/free when done.
>>
>> the slice is just a view of memory, it doesn't free/allocate/grow.
>
> you have to allocate the Eina_Slice * struct. either way. if you want to return
> one. eina_binbuf_manage_new_length() already is there and does this. you are
> creating more api's that do the same thing. we have enough api's in efl that
> duplicate functionality. explain how slice is different. you still have to
> alloc the slice STRUCT. not the data buffer. the struct that wraps the data
> buffer. that is precisely what the above already does.

See the code: there is no slice allocation, the struct is passed by
value. It's plainly clear that it's simply memory and its size: no
copy-on-write, no realloc, no alloc, no free.

It's the same as passing "void *, size_t", but you force these to be
linked, hinting their relationship, allowing bindings and the likes to
map to them.

Also, with two types just to enforce const-ness, nobody can get a
"const Eina_Slice", which holds a "const void *mem" (in a union with
"const uint8_t *bytes" for ease of use), and think that this memory is
modifiable. Likewise, getting an "Eina_Rw_Slice" with "void *mem" makes
it clear that the purpose is to write to those bytes.
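
A small sketch of how the two types document intent in a function
signature. Both structs and both functions are hypothetical stand-ins
for the proposed types, not the eina_slice.h API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read-only view: the pointers are const, so writing through them is a
 * compile-time error. */
typedef struct {
    size_t len;
    union {
        const void *mem;
        const uint8_t *bytes;
    };
} Sketch_Slice;

/* Writable view: same layout, non-const pointers. */
typedef struct {
    size_t len;
    union {
        void *mem;
        uint8_t *bytes;
    };
} Sketch_Rw_Slice;

/* Going from writable to read-only is always safe. */
static Sketch_Slice
sketch_rw_slice_to_ro(Sketch_Rw_Slice rw)
{
    Sketch_Slice ro;
    ro.len = rw.len;
    ro.mem = rw.mem;
    return ro;
}

/* The signature alone tells which side is written: dst is Rw, src is
 * not. Copies as much as fits and returns the part of dst that was
 * filled. */
static Sketch_Rw_Slice
sketch_rw_slice_copy(Sketch_Rw_Slice dst, Sketch_Slice src)
{
    size_t n = dst.len < src.len ? dst.len : src.len;

    assert(n == 0 || (dst.mem != NULL && src.mem != NULL));
    if (n > 0) memcpy(dst.mem, src.mem, n);
    dst.len = n;
    return dst;
}
```

There is deliberately no helper going the other way (read-only to
writable); that would need an explicit cast by the caller.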

Say you want to write an Eina_Stringshare to some writer object:

    // clear it's read-only
    Eina_Slice slice = eina_stringshare_slice_get(stringshared_stuff);
    Eina_Slice remaining;
    efl_io_writer_write(o, &slice, &remaining);
    printf("wrote[" EINA_SLICE_STR_FMT "], remaining[" EINA_SLICE_STR_FMT "]\n",
        EINA_SLICE_STR_PRINT(slice), EINA_SLICE_STR_PRINT(remaining));

With Eina_Binbuf:

    // new method, same number of calls... but people are unsure what
    // happens if they write to it... is it going to be modified? copied?
    // will the result be a stringshare as well?
    Eina_Binbuf *slice = eina_stringshare_binbuf_get(stringshared_stuff);
    EINA_SAFETY_ON_NULL_RETURN(slice); // need to check
    Eina_Binbuf *remaining = NULL;
    efl_io_writer_write(o, &slice, &remaining); // same call
    printf("wrote[" EINA_BINBUF_STR_FMT "], remaining[" EINA_BINBUF_STR_FMT "]\n",
        EINA_BINBUF_STR_PRINT(slice), EINA_BINBUF_STR_PRINT(remaining));
    eina_binbuf_free(slice);
    if (remaining) eina_binbuf_free(remaining);


As you see, the number of calls can be made close, but the semantics
are no longer that obvious.
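
For completeness, a sketch of the "written + remaining" semantics used
in the slice example above. Sketch_Sink is a hypothetical
fixed-capacity memory sink standing in for a writer that may accept
only part of the data; none of these names exist in EFL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    size_t len;
    union {
        const void *mem;
        const uint8_t *bytes;
    };
} Sketch_Slice;

/* Hypothetical writer: accepts data until its 8 bytes are full. */
typedef struct {
    uint8_t data[8];
    size_t used;
} Sketch_Sink;

/* Writes as much of *slice as fits. On return, *slice is shrunk to the
 * part that was written and *remaining (if not NULL) views the part
 * that was not. Nothing is allocated or freed. */
static void
sketch_sink_write(Sketch_Sink *sink, Sketch_Slice *slice,
                  Sketch_Slice *remaining)
{
    size_t room, n;

    assert(sink != NULL && slice != NULL);
    room = sizeof(sink->data) - sink->used;
    n = slice->len < room ? slice->len : room;
    if (n > 0) memcpy(sink->data + sink->used, slice->mem, n);
    sink->used += n;

    if (remaining) {
        remaining->bytes = slice->bytes + n;
        remaining->len = slice->len - n;
    }
    slice->len = n;
}
```

Writing the 11 bytes of "hello world" into an empty sink leaves a
written slice of 8 bytes and a remaining slice of 3 bytes pointing at
"rld" inside the original string.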


-- 
Gustavo Sverzut Barbieri
--------------------------------------
Mobile: +55 (16) 99354-9890

