Re: [E-devel] [RFC] Eina_Slice and Eina_Rw_Slice

Gustavo Sverzut Barbieri Mon, 15 Aug 2016 21:44:49 -0700

On Mon, Aug 15, 2016 at 11:37 PM, Carsten Haitzler <[email protected]> wrote:
> On Mon, 15 Aug 2016 22:35:58 -0300 Gustavo Sverzut Barbieri
> <[email protected]> said:
>
>> On Mon, Aug 15, 2016 at 8:13 PM, Carsten Haitzler <[email protected]>
>> wrote:
>> > On Mon, 15 Aug 2016 12:07:16 -0300 Gustavo Sverzut Barbieri
>> > <[email protected]> said:
[...]
>> It is the same, but you do not need to replicate this in every class
>> like done in Ecore_Exe, Ecore_Con, Ecore_Con_URL... :-)
>>
>> I was thinking just like you, but after talking to Tasn a bit I got
>> what he meant with a "thin wrapper around syscalls" and at the end it
>> does make sense, more sense actually.
>
> if it is just a thin wrapper then what value does it provide?


Uniform access to the calls.

Like in Linux, you do get read(2),write(2),close(2) and file
descriptors to work on almost every basic resource. But when you go to
higher level resources, like when doing HTTP over libcurl, then you
cannot call "read(2)" directly...

With the API I'm proposing you get that simplicity of Unix FD's back.
It's almost the same call and behavior.

Then you can write simple code that monitors a source, sees when
there is data to read, reads some data, waits until the destination can
hold more data, then writes it... in a loop. This is the Efl.Io.Copier.
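
For reference, the plain POSIX version of that loop looks roughly like
the sketch below. It is not the Efl.Io.Copier code: it assumes blocking
file descriptors, and the name copy_all() and the 4096-byte chunk are
made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Copy everything from `src` to `dst` through a fixed intermediate chunk.
 * Returns the total number of bytes copied, or -1 on error.
 * Hypothetical helper; assumes blocking file descriptors. */
static ssize_t copy_all(int src, int dst)
{
    char chunk[4096];
    ssize_t total = 0;

    assert(src >= 0 && dst >= 0);
    for (;;) {
        ssize_t got = read(src, chunk, sizeof(chunk));
        ssize_t done = 0;

        if (got == 0) return total;        /* end of stream */
        if (got < 0) {
            if (errno == EINTR) continue;  /* interrupted, try again */
            return -1;
        }
        /* write(2) may accept less than asked, so loop until the chunk is out */
        while (done < got) {
            ssize_t put = write(dst, chunk + done, (size_t)(got - done));
            if (put < 0) {
                if (errno == EINTR) continue;
                return -1;
            }
            done += put;
        }
        total += got;
    }
}
```

The point of the proposed interfaces is that the same loop shape keeps
working when the source or destination is not a file descriptor.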

Check:
https://git.enlightenment.org/core/efl.git/log/?h=devs/barbieri/efl-io-interfaces

You will see I already provide Stdin, Stdout, Stderr and File. Those
are "useless" since you could do the same with pure POSIX calls. But
when I add the objects implemented on top of complex libraries such as
cURL, then that code will "just work".


>> >> Efl.Io.Copier does and keeps a "read_chunk" segment that is used as
>> >> memory for the given slice.
>> >>
>> >> This is why Eina_Slice and Eina_Rw_Slice play well in this
>> >> scenario. For example you can get a slice of the given binbuf in order
>> >> to hand it to other functions that will write/read to/from it. It
>> >> doesn't require a new binbuf to be created or any COW logic.
>> >
>> > it requires a new eina slice struct to be allocated that points to the data
>> > which is EXACTLY the below binbuf api i mention.
>>
>> eina slice is a pair of 2 values, and always will be. There is no opaque
>> handle, no need for a pointer or an allocation. The eina_slice.h API is
>> mostly about passing the struct by value, not by reference/pointer.
>>
>> with binbuf indeed you're right, given its complexity you end with an
>> allocated opaque memory handle, magic validation, etc.
>
> and that's what a slice is - it's an allocated opaque handle over a blob of
> memory... is that not just binbuf?

It's not an opaque handle. It's a public structure that you allocate on
the stack... same cost as passing "const void *x, size_t xlen". But the
pair is carried together, kept in sync, easy to use, easy to understand.
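
To make the "pair passed by value" point concrete, here is a small
sketch. Demo_Slice and both functions are hypothetical stand-ins, not
the actual eina_slice.h declarations; they only assume the pointer plus
length layout discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a read-only slice: a pointer and a length. */
typedef struct {
    const void *mem;
    size_t len;
} Demo_Slice;

/* Takes the slice by value: nothing is allocated and nothing is freed. */
static size_t demo_count_byte(Demo_Slice s, unsigned char c)
{
    const unsigned char *p = s.mem;
    size_t i, n = 0;

    assert(s.mem != NULL || s.len == 0);
    for (i = 0; i < s.len; i++)
        if (p[i] == c) n++;
    return n;
}

/* The loose "const void *x, size_t xlen" form carries the same two values;
 * the slice just keeps them together. */
static size_t demo_count_byte_raw(const void *x, size_t xlen, unsigned char c)
{
    Demo_Slice s = { x, xlen };   /* lives on the stack */
    return demo_count_byte(s, c);
}
```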


>> This can be done transparently with the current API proposal.
>
> but in your current one - writes will fail because you don't allocate or
> expand an existing buffer - right? once full.. then what?

It's just like read(2)/write(2) that you know very well. If you want
to copy using them, you need an intermediate buffer.

Efl.Io.Copier is that code and holds that buffer. You can limit it or not.

If unlimited, it read()s up to a maximum chunk size and keeps expanding
the buffer. Once write() returns a positive value, that amount is
removed from the buffer, which can then shrink.

If limited, it will stop monitoring the source for reads (partially
implemented), thus will not call read(2) and will not reach the kernel.
Eventually the kernel's internal buffer fills up and the writer process
is informed.
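
A rough sketch of the limited case, using plain POSIX calls. The
Demo_Copier type, its functions and the tiny 8-byte limit are all
invented for illustration and are not taken from efl_io_copier.c.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

#define DEMO_LIMIT 8   /* tiny on purpose, so the limit is easy to hit */

/* Hypothetical copier state: one intermediate buffer with a hard limit. */
typedef struct {
    unsigned char buf[DEMO_LIMIT];
    size_t used;
} Demo_Copier;

/* When the buffer is full, the copier stops watching the source. */
static int demo_wants_read(const Demo_Copier *c)
{
    return c->used < DEMO_LIMIT;
}

/* read(2) into the free tail of the buffer. */
static ssize_t demo_fill(Demo_Copier *c, int src)
{
    ssize_t got;

    assert(demo_wants_read(c));
    got = read(src, c->buf + c->used, DEMO_LIMIT - c->used);
    if (got > 0) c->used += (size_t)got;
    return got;
}

/* write(2) what is buffered, then drop only the bytes that were accepted. */
static ssize_t demo_drain(Demo_Copier *c, int dst)
{
    ssize_t put = write(dst, c->buf, c->used);

    if (put > 0) {
        memmove(c->buf, c->buf + put, c->used - (size_t)put);
        c->used -= (size_t)put;
    }
    return put;
}
```

While demo_wants_read() is false nobody calls read(2), so unread data
piles up on the kernel side until the writer is throttled.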


>> >>  read:
>> >> https://git.enlightenment.org/core/efl.git/diff/src/lib/ecore/efl_io_reader_fd.c?h=devs/barbieri/efl-io-interfaces&id=7895d243bd204ecf986292da4866dd84cceb7c30
>> >> write:
>> >> https://git.enlightenment.org/core/efl.git/diff/src/lib/ecore/efl_io_writer_fd.c?h=devs/barbieri/efl-io-interfaces&id=7895d243bd204ecf986292da4866dd84cceb7c30
>> >>
>> >> If we convert them to take an Eina_Binbuf *buf, we'd need to add
>> >> a few more parameters, like:
>> >>  - read/write: offset and size, since you may want just a small part
>> >> of the buffer to be used (like in Efl.Io.Copier if you do
>> >> line-buffering). Offset and size is all Eina_Slice is about.
>> >
>> > how? see below. it already does this. you can wrap a binbuf around any
>> > address + length. it can return the base pointer. the bytes you want are
>> > base + offset in the "array" returned.
>>
>> If you create another binbuf "viewing" (managed + ro) the parent,
>> okay. However it is not clear to the user what happens. For example:
>>
>> pass a binbuf to read: how much is it going to read? from 0 to
>> eina_binbuf_length_get()?  What if eina_binbuf_length_get() == 0? Will
>> it resize my buffer to read more? How do I limit those? How do I know
>> how much was read? checking eina_binbuf_length_get()?
>
> if you create a binbuf from an existing pointer - then yes. it'd give access
> to just that memory range (though if you go beyond the length of the mem
> region or before the start, behaviour is undefined - in c/c++ as usual). same
> with slice - right? if you expose a pointer at all... if you only make it work
> by copies in and out of the interface then it's not zero copy. :)

Slices do not mediate memory access at all. The memory is just exposed,
see the code. There are some static inline helpers in eina_inline_slice.x
to save some typing, like memchr(), memcpy()... If you're reading from
the kernel, just use read(fd, slice.mem, slice.len)...
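
As a sketch of that last call: Demo_Rw_Slice below is a hypothetical
stand-in with only the .mem and .len fields used above, and
demo_read_into() is an invented helper, not part of the proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for a writable slice: pointer plus length. */
typedef struct {
    void *mem;
    size_t len;
} Demo_Rw_Slice;

/* Reads into the region described by `dst` and returns a slice covering
 * only the bytes that were filled (len 0 on end of stream or error). */
static Demo_Rw_Slice demo_read_into(int fd, Demo_Rw_Slice dst)
{
    Demo_Rw_Slice filled = { dst.mem, 0 };
    ssize_t got;

    assert(dst.mem != NULL);
    got = read(fd, dst.mem, dst.len);
    if (got > 0) filled.len = (size_t)got;
    return filled;
}
```

The caller decides which part of its own buffer the slice covers, so
"offset and size" never need to be separate parameters.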


>> These are conventions we'd need to create and enforce.  With slice
>> it's plain and thus clear, there is no possibility it would realloc,
>> copy-on-write, grow... so it's what you passed: mem + size.
>
> binbuf can do this too - fail to append for example if its read-only or
> "adopted ptr".

could do, but currently AFAIU it will copy-on-write. Thus another
flag/mode would be needed.


>> >>  - read/write return parameter "used" with the amount of bytes that
>> >> were processed, otherwise the user needs to store the previous
>> >> eina_binbuf_length_get(), get the new size and compute the difference.
>> >>
>> >>  - read: a base chunk size, after all we're not doing per-byte calls
>> >> to read(2), and going with a fixed large-enough parameter is kind of
>> >> suboptimal.
>> >>
>> >> Then you can see the binbuf could be used, I just think it wouldn't
>> >> be that convenient. I'll keep going like this and convert to Binbuf if
>> >> Slice turns out to be disliked once people try to use it for real :-)
>> >
>> > they'll be the same. if there is a specific feature missing - then why not
>> > just add it? i don't see a feature missing. a binbuf can wrap any arbitrary
>> > blob of bytes.
>>
>> currently the only features missing I see are simple to implement,
>> however the usage/clarity problem may not be fixed by them.
>>
>> the missing feature would be some way to "expand" or "resize" the
>> backing buffer without "append()". This could be used to expand the
>> buffer in place, then pass a pointer to it to other calls (such as
>> read(2)).
>
> so question... where would you need this for efl.net? other than just filling
> a single binbuf with data either on the "app" side or inside efl.net when
> reading a socket... - both of which can just alloc a buffer - write n bytes to
> it then create a new binbuf around that ptr (and binbuf can handle freeing it
> later too - maybe you are missing a custom "free/release" func in binbuf)

That free/release is one of the missing bits, but I'm not even talking
about those at this moment. My concern is on the usability side.

It would help if you checked the branch I'm pointing to.

> the reason i keep going on about this is... i see slice right now as simply
> duplicating binbuf and thus i see it as adding to learning curves ALSO adding
> to the cost of writing manual bindings for efl for a specific language target
> etc. etc.

Take a look at the code I've uploaded, there are some tests for
eina_slice and some basic objects were changed to use/return it.

Take all the code named after "append_length(mem, size)"... these
could all be "append_slice(slice)". It makes it easy to understand
these are related. It's easier to write bindings.

Same on return. Take eina_binbuf_string_get()... it's not usable
without eina_binbuf_length_get(), since the binbuf is binary, so you
really need the memory AND the length.

Thus eina_binbuf_slice_get() is cleaner and easier to use (not to
mention that it reduces the API size if we ignore legacy).
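
A sketch of how the paired forms relate. Demo_Buf and the demo_buf_*
functions are hypothetical and only mirror the naming pattern
(append_length, append_slice, slice_get); they are not the Eina_Binbuf
implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, for illustration only. */
typedef struct { const void *mem; size_t len; } Demo_Slice;
typedef struct { unsigned char *data; size_t len; } Demo_Buf;

/* The classic form: pointer and size as two loose parameters. */
static int demo_buf_append_length(Demo_Buf *b, const void *mem, size_t size)
{
    unsigned char *grown;

    assert(b != NULL);
    if (size == 0) return 1;
    grown = realloc(b->data, b->len + size);
    if (!grown) return 0;
    memcpy(grown + b->len, mem, size);
    b->data = grown;
    b->len += size;
    return 1;
}

/* The slice form is the same operation with the pair kept together. */
static int demo_buf_append_slice(Demo_Buf *b, Demo_Slice s)
{
    return demo_buf_append_length(b, s.mem, s.len);
}

/* One getter returns memory AND length, so binary data with embedded
 * NUL bytes stays usable without a second call. */
static Demo_Slice demo_buf_slice_get(const Demo_Buf *b)
{
    Demo_Slice s = { b->data, b->len };
    return s;
}
```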



>> eventually "ro" flags (bool) could be changed to some more conditions,
>> such as fixed capacity (no realloc), this can be used when you want to
>> avoid reallocs() on reset, remove or append.
>>
>> bottom line is: feature wise, it's easy to implement. But the number
>> of features and lack of clarity on how it's supposed to be used is
>> what's bothering me :-/
>
> then maybe documentation and sample code will make it clear. the SIMPLE usage
> is:
>
> on writes:
>   1. ask efl.net to create a write buffer.
>   2. append to write buffer (binbuf_append - yes its a copy!)
>   3. send/submit the binbuf for sending (efl.net adds to a list of pending to
>      write binbufs)
>   4. inside efl.net the binbuf is freed (or returned to a pool of buffers
>      ready to re-use)

In a way, this is what Efl.Io.Copier is, not tied to network of course.

Something to consider in the list-of-binbuf above is that binbuf grows
in steps to avoid reallocs, so depending on the usage it can end up with
a high proportion of unused memory.

So I'm using a single binbuf right now; we can measure and see if
that's a performance hit in the future.


> on reads:
>   1. efl.net creates a binbuf of UP to N bytes in size
>   2. efl.net read()s into this binbuf up to the max length
>   3. if length shorter, set binbuf len to be this shortened length
>   4. append read binbuf to "read queue"
>   5. walk through read queue calling callbacks on read buffers
>   6. once callback has been called, return binbuf to the pool

https://git.enlightenment.org/core/efl.git/tree/src/lib/ecore/efl_io_copier.c?h=devs/barbieri/efl-io-interfaces
sounds familiar? ;-) (just remember it's using a single binbuf and
still not reading directly to it as it should, there is a TODO for
that in that file).

[...]

>> >> > one way or another i don't think you can avoid allocating some object
>> >> > that represents a blob of data. if its a slice, or a binbuf - it doesn't
>> >> > matter. my point here is - create a binbuf and PASS It IN as a whole.
>> >> > it's immutable once passed to efl.net or when efl.net passes it back to
>> >> > the caller in events. the binbuf size is as large as is needed to
>> >> > transport that piece of data.
>> >> >
>> >> > all your slice is is a binbuf under another name here. once you fix it
>> >> > and do it right and allocate/free when done.
>> >>
>> >> the slice is just a view of memory, it doesn't free/allocate/grow.
>> >
>> > you have to allocate the Eina_Slice * struct. either way. if you want to
>> > return one. eina_binbuf_manage_new_length() already is there and does this.
>> > you are creating more api's that do the same thing. we have enough api's in
>> > efl that duplicate functionality. explain how slice is different. you still
>> > have to alloc the slice STRUCT. not the data buffer. the struct that wraps
>> > the data buffer. that is precisely what the above already does.
>>
>> see the code, there is no slice allocation, the struct is passed as
>> value. It's plain clear that it's simply memory and its size, no
>> copy-on-write, no realloc, no alloc, no free.
>>
>> It's the same as passing "void *, size_t", but you force these to be
>> linked, hinting their relationship, allowing bindings and the likes to
>> map to them.
>
> sure - you bound ptr+size in a datatype. you will have to ALLOCATE these if
> you are to have any buffering. then what about the backing data behind the
> slice? how can you keep it around when buffered? the slice doesnt have any
> ownership of the data pointed to...

this is right, it doesn't have any ownership. The buffering is outside
of it, it's just used to refer to the buffered regions we want to deal
with, like the slice we want to append, the slice we want to write,
the slice we want to read...


> unless you intend to have no buffering. everything is just simple direct to
> kernel read/write and if a write fails - pass back that failure, OR make
> writes block. ?

See the explanations and code :-) the buffering is handled at another level.



-- 
Gustavo Sverzut Barbieri
--------------------------------------
Mobile: +55 (16) 99354-9890

