Re: [E-devel] [RFC] Eina_Slice and Eina_Rw_Slice

The Rasterman Tue, 16 Aug 2016 00:33:07 -0700

On Tue, 16 Aug 2016 01:43:43 -0300 Gustavo Sverzut Barbieri
<barbi...@gmail.com> said:


> On Mon, Aug 15, 2016 at 11:37 PM, Carsten Haitzler <ras...@rasterman.com>
> wrote:
> > On Mon, 15 Aug 2016 22:35:58 -0300 Gustavo Sverzut Barbieri
> > <barbi...@gmail.com> said:
> >
> >> On Mon, Aug 15, 2016 at 8:13 PM, Carsten Haitzler <ras...@rasterman.com>
> >> wrote:
> >> > On Mon, 15 Aug 2016 12:07:16 -0300 Gustavo Sverzut Barbieri
> >> > <barbi...@gmail.com> said:
> [...]
> >> It is the same, but you do not need to replicate this in every class
> >> like done in Ecore_Exe, Ecore_Con, Ecore_Con_URL... :-)
> >>
> >> I was thinking just like you, but after talking to Tasn a bit I got
> >> what he meant with a "thin wrapper around syscalls" and at the end it
> >> does make sense, more sense actually.
> >
> > if it is just a thin wrapper then what value does it provide?
> 
> Uniform access to the calls.
> 
> Like in Linux, you do get read(2),write(2),close(2) and file
> descriptors to work on almost every basic resource. But when you go to
> higher level resources, like when doing HTTP over libcurl, then you
> cannot call "read(2)" directly...
> 
> With the API I'm proposing you get that simplicity of Unix FD's back.
> It's almost the same call and behavior.
> 
> Then you can write simple code that monitors a source, sees when
> there is data to read, reads some data, waits until the destination can
> hold more data, then writes it... in a loop. This is the Efl.Io.Copier.
> 
> Check:
> https://git.enlightenment.org/core/efl.git/log/?h=devs/barbieri/efl-io-interfaces
> 
> You will see I already provide Stdin, Stdout, Stderr and File. Those
> are "useless" since you could do with pure POSIX calls. But when I add
> the objects implemented on complex libraries such as cURL, then that
> code will "just work".

unix fd's are NOT simple. not if you want to be non-blocking. you have to
handle write failures and figure out what was and was not written from your
buffer, select() on the fd until it is writable again and write then -
and for all you know you may be able to write just a single byte and then have
to try again and so on.

unix read/write and fd's push the logic of this up the stack into the app. the
alternative is to do blocking i/o and that is just not viable for something
that multiplexes all its i/o through an event loop.
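
To make that bookkeeping concrete, here is a minimal sketch of what an app ends up writing around a non-blocking fd. The helper names (example_set_nonblocking, example_write_some) are made up for illustration and are not EFL API; only write(2), fcntl(2) and the errno values are real.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 or -1. */
static int
example_set_nonblocking(int fd)
{
   int flags = fcntl(fd, F_GETFL, 0);
   if (flags < 0) return -1;
   return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: push as much of buf as the kernel accepts right now.
 * Returns the number of bytes taken (possibly 0, possibly less than len),
 * or -1 on a hard error. The caller must keep the unwritten tail and retry
 * once select()/poll() reports the fd as writable again. */
static ssize_t
example_write_some(int fd, const void *buf, size_t len)
{
   const char *p = buf;
   size_t done = 0;

   assert(buf != NULL || len == 0);
   while (done < len)
     {
        ssize_t n = write(fd, p + done, len - done);
        if (n > 0)
          {
             done += (size_t)n; /* partial write: advance and try the rest */
             continue;
          }
        if (n < 0 && errno == EINTR) continue; /* interrupted: retry */
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
          return -1;                           /* real failure */
        break; /* kernel buffer full (or nothing accepted): stop for now */
     }
   return (ssize_t)done;
}
```

Any return value smaller than len means the caller now owns the problem of remembering the tail and coming back later, which is the logic being pushed up the stack.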

what i have read of this so far means pushing the "kernel buffer is full, write
failed now or partly failed" back off into the app. and that is not even close
to replacing ecore_con - it fundamentally misses the "i'll buffer that for
you, don't worry about it" nature of it that takes the kernel's limited
buffering and extends it to "infinite" that saves a lot of pain and agony.
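
The "i'll buffer that for you" behaviour can be sketched as a queue of pending chunks owned by the library. This is only an illustration under assumed names (Example_Chunk, Example_Write_Queue, example_queue_push, example_queue_flush); it is not how ecore_con is actually implemented.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical pending chunk: a private copy of the caller's bytes. */
typedef struct Example_Chunk {
   struct Example_Chunk *next;
   size_t len; /* total bytes in data */
   size_t off; /* bytes already written to the fd */
   char   data[];
} Example_Chunk;

typedef struct {
   Example_Chunk *head, *tail;
} Example_Write_Queue;

/* Always accepts the data (memory permitting): copies it and queues it. */
static int
example_queue_push(Example_Write_Queue *q, const void *mem, size_t len)
{
   Example_Chunk *c;

   assert(mem != NULL);
   c = malloc(sizeof(*c) + len);
   if (!c) return -1;
   c->next = NULL;
   c->len = len;
   c->off = 0;
   memcpy(c->data, mem, len);
   if (q->tail) q->tail->next = c;
   else q->head = c;
   q->tail = c;
   return 0;
}

/* Called when the fd is writable: drains what the kernel will take and
 * keeps the rest queued. Returns 0 on success, -1 on a hard error. */
static int
example_queue_flush(Example_Write_Queue *q, int fd)
{
   while (q->head)
     {
        Example_Chunk *c = q->head;
        size_t left = c->len - c->off;

        if (left > 0)
          {
             ssize_t n = write(fd, c->data + c->off, left);
             if (n < 0)
               {
                  if (errno == EINTR) continue;
                  if (errno == EAGAIN || errno == EWOULDBLOCK) return 0;
                  return -1;
               }
             if (n == 0) return 0;
             c->off += (size_t)n;
             if (c->off < c->len) continue; /* partial: keep tail queued */
          }
        q->head = c->next;
        if (!q->head) q->tail = NULL;
        free(c);
     }
   return 0;
}
```

The caller only ever sees push succeed; partial writes and EAGAIN stay inside flush. The price is the memcpy() in push, since the queue must own bytes that outlive the call.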

> 
> >> >> Efl.Io.Copier does and keeps a "read_chunk" segment that is used as
> >> >> memory for the given slice.
> >> >>
> >> >> This is why the Eina_Slice and Eina_Rw_Slice play well in this
> >> >> scenario. For example you can get a slice of the given binbuf in order
> >> >> to hand it to other functions that will write/read to/from it. It
> >> >> doesn't require a new binbuf to be created or any COW logic.
> >> >
> >> > it requires a new eina slice struct to be allocated that points to the
> >> > data which is EXACTLY the below binbuf api i mention.
> >>
> >> eina slice is a pair of 2 values, and always will be. There is no opaque
> >> handle, no need for a pointer or an allocation. The eina_slice.h API is
> >> mostly about passing a struct by value, not by reference/pointer.
> >>
> >> with binbuf indeed you're right, given its complexity you end with an
> >> allocated opaque memory handle, magic validation, etc.
> >
> > and that's what a slice is - it's an allocated opaque handle over a blob of
> > memory... is that not just binbuf?
> 
> it's not opaque handle. It's a public structure, you allocate it on
> stack... same cost as doing "const void *x, size_t xlen". But the pair
> is carried, in sync, easy to use, easy to understand.

it'll need to be allocated if you ever have buffering... and the data it points
to will have to be managed.
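
As a rough picture of the two positions above: a pointer+length pair passed by value needs no allocation, but it also owns nothing, so anything that must outlive the call has to be copied or otherwise kept alive by someone else. The names below (Example_Slice, example_slice_sub, example_slice_dup) are hypothetical stand-ins, not the eina_slice.h API from the branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical read-only view: just a length and a pointer, no ownership. */
typedef struct {
   size_t      len;
   const void *mem;
} Example_Slice;

/* Narrow a view to [offset, offset + len), clamped to the parent's bounds.
 * Everything is passed and returned by value; nothing is allocated. */
static Example_Slice
example_slice_sub(Example_Slice sl, size_t offset, size_t len)
{
   Example_Slice out;

   assert(sl.mem != NULL);
   if (offset > sl.len) offset = sl.len;
   if (len > sl.len - offset) len = sl.len - offset;
   out.len = len;
   out.mem = (const char *)sl.mem + offset;
   return out;
}

/* If the bytes must be buffered past the call, somebody has to own a copy.
 * Returns a heap copy the caller must free(), or NULL on failure. */
static void *
example_slice_dup(Example_Slice sl)
{
   void *copy = malloc(sl.len ? sl.len : 1);
   if (copy && sl.len) memcpy(copy, sl.mem, sl.len);
   return copy;
}
```

example_slice_sub() is the "same cost as void * plus size_t" case; example_slice_dup() is where allocation and lifetime management come back as soon as buffering is involved.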

> >> This can be done transparently with the current API proposal.
> >
> > but in your current one - writes will fail because you don't allocate or
> > expand an existing buffer - right? once full.. then what?
> 
> It's just like read(2)/write(2) that you know very well. If you want
> to copy using them, you need an intermediate buffer.
> 
> Efl.Io.Copier is that code and holds that buffer. You can limit it or not.
> 
> If unlimited, it read()s up to a maximum chunk size and keeps expanding
> the buffer. Once write() returns a positive value, that amount is
> removed from the buffer, which can then shrink.
> 
> If limited, it will stop monitoring read (partially implemented), thus
> will not call read(2), thus will not reach the kernel and eventually
> its internal buffer will be full and the writer process will be
> informed.
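
A minimal sketch of the limited case described above, assuming a fixed-capacity intermediate buffer. Example_Copier, EXAMPLE_COPIER_CAP and example_copier_step are invented for this illustration and are not the Efl.Io.Copier code from the branch.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define EXAMPLE_COPIER_CAP 4096 /* hypothetical buffer limit */

typedef struct {
   char   buf[EXAMPLE_COPIER_CAP];
   size_t used; /* bytes read from src but not yet written to dst */
} Example_Copier;

/* One copy step, meant to be called when src is readable or dst writable.
 * Returns the number of bytes written to dst in this step, or -1 on a
 * hard error. */
static ssize_t
example_copier_step(Example_Copier *c, int src, int dst)
{
   ssize_t w;

   assert(c->used <= EXAMPLE_COPIER_CAP);

   /* Read only while there is room. Once full, we simply stop reading, so
    * the kernel buffer on the source side fills up and the producer is
    * throttled. */
   if (c->used < EXAMPLE_COPIER_CAP)
     {
        ssize_t r = read(src, c->buf + c->used, EXAMPLE_COPIER_CAP - c->used);
        if (r > 0) c->used += (size_t)r;
        else if (r < 0 && errno != EAGAIN && errno != EWOULDBLOCK &&
                 errno != EINTR)
          return -1;
     }
   if (c->used == 0) return 0;

   w = write(dst, c->buf, c->used);
   if (w < 0)
     {
        if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
          return 0; /* destination not ready: keep the data buffered */
        return -1;
     }
   /* Remove what was written; whatever is left moves to the front. */
   memmove(c->buf, c->buf + w, c->used - (size_t)w);
   c->used -= (size_t)w;
   return w;
}
```

The unlimited variant would grow buf with realloc() instead of refusing to read; in both variants the data passes through the intermediate buffer on its way from src to dst.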

so you are FORCING an api that HAS to memcpy() at the time a slice is passed in
before the func returns. that means either it always has to memcpy somewhere
(or has to once writes() start failing when a kernel buffer is full) OR it
requires a blocking api...

what i see here is that you are designing either:

1. a blocking api (unacceptable from any main loop construct)
or
2. an api where writes can fail when buffers are full and that requires the
caller handle buffering and write failures themselves (which makes the api a
pain to use and no better than raw read/write with a raw fd)
or
3. an api that requires a memcpy of data on write ALWAYS once kernel buffers
fill up and no ability to zero copy (which goes against the whole original idea
of you wanting to make it efficient).

:(

> >> >>  read:
> >> >> https://git.enlightenment.org/core/efl.git/diff/src/lib/ecore/efl_io_reader_fd.c?h=devs/barbieri/efl-io-interfaces&id=7895d243bd204ecf986292da4866dd84cceb7c30
> >> >> write:
> >> >> https://git.enlightenment.org/core/efl.git/diff/src/lib/ecore/efl_io_writer_fd.c?h=devs/barbieri/efl-io-interfaces&id=7895d243bd204ecf986292da4866dd84cceb7c30
> >> >>
> >> >> If we convert to handle an Eina_Binbuf *buf to them, we'd need to add
> >> >> few more parameters, like:
> >> >>  - read/write: offset and size, since you may want just a small part
> >> >> of the buffer to be used (like in Efl.Io.Copier if you do
> >> >> line-buffering). Offset and size is all Eina_Slice is about.
> >> >
> >> > how? see below. it already does this. you can wrap a binbuf around any
> >> > address + length. it can return the base pointer. the bytes you want are
> >> > base + offset in the "array" returned.
> >>
> >> If you create another binbuf "viewing" (managed + ro) the parent,
> >> okay. However it is not clear to user, such as:
> >>
> >> pass a binbuf to read: how much is it going to read? from 0 to
> >> eina_binbuf_length_get()?  What if eina_binbuf_length_get() == 0? Will
> >> it resize my buffer to read more? How do I limit those? How do I know
> >> how much was read? checking eina_binbuf_length_get()?
> >
> > > if you create a binbuf from an existing pointer - then yes. it'd give
> > > access to just that memory range (though if you go beyond the length of the
> > > mem region or before the start, behaviour is undefined - in c/c++ as
> > usual). same with slice - right? if you expose a pointer at all... if you
> > only make it work by copies in and out of the interface then it's not zero
> > copy. :)
> 
> slices do not handle memory access at all. It's just exposed, see the
> code. There are some static inline helpers in eina_inline_slice.x to
> save some typing like memchr(), memcpy()... If you're reading from the
> kernel, just use read(fd, slice.mem, slice.len)...
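
The read(fd, slice.mem, slice.len) usage mentioned above could look roughly like this. Example_Rw_Slice and example_read_into are hypothetical names for the sketch; only the mem/len pair is taken from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical writable view: caller-provided memory plus its size. */
typedef struct {
   size_t len;
   void  *mem;
} Example_Rw_Slice;

/* Read into the region described by dst and return a slice covering only
 * the bytes that were actually filled (len 0 on EOF or error). The storage
 * is never resized or reallocated; it stays whatever the caller passed. */
static Example_Rw_Slice
example_read_into(int fd, Example_Rw_Slice dst)
{
   Example_Rw_Slice filled = { 0, dst.mem };
   ssize_t n;

   assert(dst.mem != NULL || dst.len == 0);
   n = read(fd, dst.mem, dst.len);
   if (n > 0) filled.len = (size_t)n;
   return filled;
}
```

The input slice answers "how much may be read" and the returned slice answers "how much was read", the two questions raised about passing a binbuf.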
> 
> 
> >> These are conventions we'd need to create and enforce.  With slice
> >> it's plain and thus clear, there is no possibility it would realloc,
> >> copy-on-write, grow... so it's what you passed: mem + size.
> >
> > binbuf can do this too - fail to append for example if its read-only or
> > "adopted ptr".
> 
> could do, but currently AFAIU it will copy-on-write. Thus another
> flag/mode would be needed.
> 
> 
> >> >>  - read/write return parameter "used" with the amount of bytes that
> >> >> were processed, otherwise user needs to store previous
> >> >> eina_binbuf_length_get(), get the new size and compute himself.
> >> >>
> >> >>  - read: a base chunk size, after all we're not doing per-byte calls
> >> >> to read(2) and going with a fixed large enough parameter is kind of
> >> >> sub-optimal.
> >> >>
> >> >> Then you can see the binbuf could be used, I just think they wouldn't
> >> >> be that convenient. I'll keep going like this and convert to Binbuf if
> >> >> Slice turns out to be disliked once people try to use it for real :-)
> >> >
> >> > they'll be the same. if there is a specific feature missing - then why
> >> > not just add it? i don't see a feature missing. a binbuf can wrap any
> >> > arbitrary blob of bytes.
> >>
> >> currently the only features missing I see are simple to implement,
> >> however the usage/clarity may not be fixed.
> >>
> >> features missing would be some kind of "expand" or "resize" the
> >> backing buffer without "append()". This could be used to expand buffer
> >> in-place then use a pointer to it with other calls (such as read(2)).
> >
> > so question... where would you need this for efl.net? other than just
> > filling a single binbuf with data either on the "app" side or inside
> > efl.net when reading a socket... - both of which can just alloc a buffer -
> > write n bytes to it then create a new binbuf around that ptr (and binbuf
> > > can handle freeing it later too - maybe you are missing a custom
> > > "free/release" func in binbuf)
> 
> that free/release is one of the missing bits, but I'm not even talking
> about those at this moment. It's on the usability side.
> 
> it would help if you check the branch I'm pointing.
> 
> > the reason i keep going on about this is... i see slice right now as simply
> > duplicating binbuf and thus i see it as adding to learning curves and ALSO
> > adding to the cost of writing manual bindings for efl for a specific
> > language target etc. etc.
> 
> Take a look at the code I've uploaded, there are some tests for
> eina_slice and some basic objects were changed to use/return it.
> 
> Take all the code named after "append_length(mem, size)"... these
> could all be "append_slice(slice)". It makes it easy to understand
> these are related. It's easier to write bindings.
> 
> Same on return. Take eina_binbuf_string_get()... it's not usable
> without eina_binbuf_length_get(), since the binbuf is binary, so you
> really need the memory AND the length.
> 
> Thus eina_binbuf_slice_get() is cleaner and easier to use (not to mention
> that it reduces the API size if we ignore legacy).
> 
> 
> 
> >> eventually "ro" flags (bool) could be changed to some more conditions,
> >> such as fixed capacity (no realloc), this can be used when you want to
> >> avoid reallocs() on reset, remove or append.
> >>
> >> bottom line is: feature wise, it's easy to implement. But the number
> >> of features and lack of clarity on how it's supposed to be used is
> >> what's bothering me :-/
> >
> > then maybe documentation and sample code will make it clear. the SIMPLE
> > usage is:
> >
> > on writes:
> >   1. ask efl.net to create a write buffer.
> >   2. append to write buffer (binbuf_append - yes its a copy!)
> >   3. send/submit the binbuf for sending (efl.net adds to a list of pending
> > to write binbufs)
> >   4. inside efl.net the binbuf is freed (or returned to a pool of buffers
> > ready to re-use)
> 
> In a way, this is what Efl.Io.Copier is, not tied to network of course.
> 
> something to consider in the list-of-binbuf above is that binbuf grows
> to avoid reallocs, so depending on the usage it can hit high rate of
> unused memory.
> 
> then I'm using a single binbuf right now, can measure and see if
> that's a performance hit in the future.
> 
> 
> > on reads:
> >   1. efl.net creates a binbuf of UP to N bytes in size
> >   2. efl.net read()s into this binbuf up to the max length
> >   3. if length shorter, set binbuf len to be this shortened length
> >   4. append read binbuf to "read queue"
> >   5. walk through read queue calling callbacks on read buffers
> >   6. once callback has been called, return binbuf to the pool
> 
> https://git.enlightenment.org/core/efl.git/tree/src/lib/ecore/efl_io_copier.c?h=devs/barbieri/efl-io-interfaces
> sounds familiar? ;-) (just remember it's using a single binbuf and
> still not reading directly to it as it should, there is a TODO for
> that in that file).
> 
> [...]
> 
> >> >> > one way or another i don't think you can avoid allocating some object
> >> >> > that represents a blob of data. if its a slice, or a binbuf - it
> >> >> > doesn't matter. my point here is - create a binbuf and PASS IT IN as
> >> >> > a whole. it's immutable once passed to efl.net or when efl.net passes
> >> >> > it back to the caller in events. the binbuf size is as large as is
> >> >> > needed to transport that piece of data.
> >> >> >
> >> >> > all your slice is is a binbuf under another name here. once you fix it
> >> >> > and do it right and allocate/free when done.
> >> >>
> >> >> the slice is just a view of memory, it doesn't free/allocate/grow.
> >> >
> >> > you have to allocate the Eina_Slice * struct. either way. if you want to
> >> > return one. eina_binbuf_manage_new_length() already is there and does
> >> > this. you are creating more api's that do the same thing. we have enough
> >> > api's in efl that duplicate functionality. explain how slice is
> >> > different. you still have to alloc the slice STRUCT. not the data
> >> > buffer. the struct that wraps the data buffer. that is precisely what
> >> > the above already does.
> >>
> >> see the code, there is no slice allocation, the struct is passed as
> >> value. It's plain clear that it's simply memory and its size, no
> >> copy-on-write, no realloc, no alloc, no free.
> >>
> >> It's the same as passing "void *, size_t", but you force these to be
> >> linked, hinting their relationship, allowing bindings and the likes to
> >> map to them.
> >
> > sure - you bound ptr+size in a datatype. you will have to ALLOCATE these if
> > you are to have any buffering. then what about the backing data  behind the
> > slice? how can you keep it around when buffered? the slice doesnt have any
> > ownership of the data pointed to...
> 
> this is right, it doesn't have any ownership. The buffering is outside
> of it, it's just used to refer to the buffered regions we want to deal
> with, like the slice we want to append, the slice we want to write,
> the slice we want to read...
> 
> 
> > unless you intend to have no buffering. everything is just simple direct to
> > kernel read/write and if a write fails - pass back that failure, OR make
> > writes block. ?
> 
> See the explanations and code :-) the buffering is handled at another level.
> 
> 
> 
> -- 
> Gustavo Sverzut Barbieri
> --------------------------------------
> Mobile: +55 (16) 99354-9890
> 


-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    ras...@rasterman.com


------------------------------------------------------------------------------
_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
