Hey,

As discussed on IRC: all look like promising ideas, just please make
sure to benchmark both speed and memory usage to make sure there are no
regressions.

The Eo benchmark suite should already be fairly complete for speed, but
there are no benchmarks for memory usage, if memory serves. I'd also
take a look to make sure the tests still work as expected, given that
they've only been touched twice since I last worked on Eo in 2016.

Bottom line: looks promising, but please test, test and test. Eo is
*the* hot path of the EFL; even the smallest changes can have severe
impacts.

--
Tom

On 18/02/2020 16:45, Marcel Hollerbach wrote:
> Hi,
> 
> here is a little draft of what I will work on, beginning on the 10th of
> March. Until then, there is time for discussion.
> 
> == Introduction ==
> Eo is an OOP framework implementation. Classes are basically
> implementations of a set of interfaces / parent classes / mixins. Each
> interface / parent class / mixin consists of a set of functions.
> The API a class provides is called the absolute set of functions of that
> class. Each element in the absolute set of functions has to have an
> implementation, which has to be stored somewhere; this piece of code and
> structure is called the vtable. Every entry in the vtable consists of the
> implementation of an API and the source class (which is needed for
> private data calculation, which is another topic).
> 
> Right now, each function has an ID (fid). This ID is globally unique and
> simply incremented for each and every function that gets registered to Eo.
> Each vtable of a class has the space to store *all* functions that have
> been registered up to the current point.
> Right now this happens in a way where the function ID of each function
> is split into an upper part and a lower part (based on the bit
> representation); this then results in the dich-chain1 ID (dc1) and the
> dich-chain2 ID (dc2), which are used to get to the correct slot (pseudo
> code: vtable->dichchain1[dc1]->dichchain2[dc2]). If a class needs
> to store the function X, then dich-chain1 at index dc1 gets an allocated
> dich-chain2, which consists of zeros except in the slot dc2, where we store
> the implementation and the source.
> 
> == The problem ==
> The problem with this approach is that we waste quite a lot of memory,
> especially for widgets that do not implement a lot of API with close
> fids, leading to the fact that we allocate a lot of dich-chain2s where
> only a few slots are really used. If you measure how many slots
> we allocate versus how many we actually use, we get a mean ratio of
> roughly 0.36, which is quite bad IMO. In total, we have 35296 slots
> allocated, and we use 16807. (This is already honoring the Eo
> optimizations we have; the referenced slots are NOT added to this.)
> 
> == The Idea number 1 ==
> What we can do to improve this is change the heuristic for how we
> allocate fids. We could increment one dc1 counter per
> class/interface/mixin, and for each function in this class/interface/mixin
> we increment a class/interface/mixin-private counter, which will be the dc2.
> We then combine them into the fid via something like
> dc1*10000+dc2. When an Eo call then happens, we simply decompose the fid
> and walk the two dich-chains, exactly like we do it now.
> What this changes in the memory layout is that we fully use each and
> every dich-chain2 that we allocate.
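A sketch of what Idea 1's allocation could look like, assuming the dc1*10000+dc2 packing mentioned above (all names here are invented for illustration; real code would presumably pack bits instead of multiplying):

```c
#include <assert.h>
#include <stdio.h>

/* Sketch of Idea 1's fid allocation: one global counter handing out a dc1
 * per class/interface/mixin, plus a private per-class counter for the dc2.
 * The 10000 multiplier mirrors the "dc1*10000+dc2" example in the mail. */

#define DC2_STRIDE 10000u

static unsigned next_dc1 = 0;

typedef struct {
   unsigned dc1;      /* assigned once when the class is registered */
   unsigned next_dc2; /* private counter, one step per function */
} class_ids;

static void
class_register(class_ids *c)
{
   c->dc1 = next_dc1++;
   c->next_dc2 = 0;
}

/* Hand out the next fid of this class. */
static unsigned
class_new_fid(class_ids *c)
{
   return c->dc1 * DC2_STRIDE + c->next_dc2++;
}

/* Decompose a fid on call resolution. */
static void
fid_split(unsigned fid, unsigned *dc1, unsigned *dc2)
{
   *dc1 = fid / DC2_STRIDE;
   *dc2 = fid % DC2_STRIDE;
}
```

Because all fids of one class share a single dc1, every allocated dich-chain2 is densely filled from slot 0 up, which is the whole point of the idea.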
> 
> The downside of that method is that we get a longer dich-chain1 array.
> Right now, with elementary_test, the dich-chain1 is at most 32 pointers
> long, meaning we are allocating 2KB in total over all classes we have.
> 
> With this new idea the dich-chain1 will be 190 elements long, meaning 70KB
> over all classes we have.
> However, due to the savings in the dich-chain2s, we are saving roughly
> 144KB of allocated data (1 slot is 8 bytes), which means we still end up
> with a net saving of roughly 74KB.
> 
> This is something which I would implement; additional min/max
> checks or COW on the dich-chain1 would probably save even more.
> 
> == The Idea number 2 ==
> After Idea 1 is implemented, we could work on this.
> Right now we are allocating one dc1 slot *per* class we have. Right now
> we have 26 mixins, 90 regulars, 13 abstracts and 61 interfaces (summing up
> to 190), which determines the length of the dc1. To improve the
> situation, we could say that each fid of a regular class is the
> max of all function IDs, including those of the parent class, + 1. That
> means that fids are not globally unique anymore, but they are still
> unique within the type hierarchy they are defined in.
> The more interesting thing is that all regular APIs are now in the same
> dc1 slot, which means it is enough to allocate the dc1 with the size of
> the sum of interfaces and mixins, ending up with something
> lower than 15KB per class, bringing us to a saving of 130KB of allocated
> data (compared to the state that we have right now).
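If I read the scheme right, the per-hierarchy fid assignment could look roughly like this hypothetical sketch (klass, fid_count and the helper names are invented):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Sketch of Idea 2's fid assignment for regular classes: a class's fids
 * start right after the highest fid of its parent chain, so they are only
 * unique within one inheritance hierarchy, and every regular API can live
 * in the shared dc1[0] chain. */

typedef struct klass {
   const struct klass *parent;
   unsigned fid_count; /* functions defined up to and including this class */
} klass;

static void
klass_init(klass *k, const klass *parent, unsigned nfuncs)
{
   k->parent = parent;
   k->fid_count = (parent ? parent->fid_count : 0) + nfuncs;
}

/* fid of the i-th function defined by class k (dc1 is implicitly 0). */
static unsigned
klass_fid(const klass *k, unsigned i)
{
   return (k->parent ? k->parent->fid_count : 0) + i;
}
```

A derived class simply continues numbering where its parent stopped, which is also exactly why two independent parent chains (multi-class inheritance) would collide.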
> 
> Another "downside" of that approach is that we lose the ability to do
> multi-class inheritance; however, we only have one widget that uses that
> ability, and it does not work anyway, so we could just deprecate it, I
> think.
> 
> Funny side effect of this idea: if we resolve a regular API call on an
> object, the dc1[0] slot will always be the same. That means we could
> store that pointer in parallel to the vtable pointer, saving us one
> indirection PER call resolve. (Right now we have 2.)
> 
> Any ideas or thoughts about this? I will start the work on this on
> 10.03.2020 :)
> 
> Greetings,
>    bu5hm4n
> 
> PS: you can find a plot of our slot usage ratio in Eo attached.
> 
> 
> _______________________________________________
> enlightenment-devel mailing list
> enlightenment-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
> 

