Re: [E-devel] Shared Strings

The Rasterman Thu, 08 May 2008 01:39:02 -0700

On Wed, 7 May 2008 20:55:40 -0500 "Nathan Ingersoll" <[EMAIL PROTECTED]>
babbled:

after. i've run the tests myself now.

i'll add some stats. e17 maintains about 3000-4000 unique strings in
evas_stringshare in my tests. i also checked the number of adds, dels and
lookups in evas_stringshare usage. about 6.48% of calls are adds of a
unique string, 5.12% are deletes of a string and 88.41% are accesses (an add
or a del of an existing string where the operation does not free or allocate
any memory but just refcounts up or down).

now based on this, comparing the existing stringshare with string_instance: if
you account for the relative usage of the add, del and access paths,
string_instance gets overall faster once you have about 3200 or so strings. so
e17 is about at the cusp point. it's also one of the heavier users i imagine.

i simply changed the hash bucket count to 1024 instead of 256 and that
moves the crossover point to about 7200 strings - well beyond normal
usage of e17 anyway. with the 1024 buckets for stringshare, adds and deletes
are still significantly faster even at 10,000 strings (string_instance takes
1.8 and 1.4 times the time respectively compared to stringshare), but
string_instance takes 0.8 times the time for lookups.
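
a rough sketch (not part of the original tests) of how the call mix and the
per-operation ratios above combine into one overall figure. the names
op_mix, weighted_rel_time and e17_mix are made up for illustration, and the
sketch makes two assumptions: the "lookups" ratio applies to the 88.41%
refcount-only accesses, and each operation class costs the same absolute time
under stringshare, so the call shares can be used directly as weights.

```c
#include <stddef.h>

/* One operation class: its share of all calls and the time string_instance
 * needs relative to stringshare (1.0 = same speed). */
struct op_mix {
    double share;      /* fraction of all calls, 0..1 */
    double rel_time;   /* string_instance time / stringshare time */
};

/* Weighted relative time over a whole workload.
 * ASSUMPTION: every operation class costs the same absolute time under
 * stringshare, so the call shares can be used directly as weights. */
double weighted_rel_time(const struct op_mix *ops, size_t n)
{
    double total = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        total += ops[i].share * ops[i].rel_time;
    return total;
}

/* e17 call mix and the 10,000-string ratios quoted above. */
static const struct op_mix e17_mix[] = {
    { 0.0648, 1.8 },   /* adds of a unique string */
    { 0.0512, 1.4 },   /* deletes that free a string */
    { 0.8841, 0.8 },   /* refcount-only accesses (assumed = "lookups") */
};
```

under those assumptions weighted_rel_time(e17_mix, 3) comes out at roughly
0.9, i.e. just below 1.0, which fits with 10,000 strings being past the
crossover point for 1024 buckets.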

overall it's a close race. i'll try to improve stringshare a little and see
what i get, but beyond making it have dynamic bucket sizes (like ecore_hash) it
isn't likely to go far. a dynamic bucket size will mean it will scale very high
(question: do we need it to go that high?) but the idea of having to re-do the
bucket array at certain points is a little uncomfortable (so u go from 3000 to
3001 and the system has to spend extra cycles re-packing all hash items into a
new, bigger bucket set). yes - this is nicer in terms of base mem usage of
course. so the thing here is to figure out how to get the best thing we can for
the least cost. i do know stringshare has a 1 alloc per unique string overhead.
that's about as small as it gets (plus either 8 or 12 bytes (32 or 64bit) per
unique string for refcount and pointer).
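
to make the "re-packing" cost concrete, here is a minimal sketch of what
growing a chained hash's bucket array involves. this is a generic illustration,
not ecore_hash's actual code; dyn_node, dyn_table and dyn_table_resize are
hypothetical names, and caching the full hash in each node is an assumption
made so the resize does not have to re-hash every string.

```c
#include <stdlib.h>

struct dyn_node {
    struct dyn_node *next;   /* chain within one bucket */
    unsigned int     hash;   /* full hash, cached so a resize need not rescan the string */
};

struct dyn_table {
    struct dyn_node **buckets;
    size_t            nbuckets;
    size_t            count;    /* number of nodes stored */
};

/* Move every node into a new bucket array of new_size slots.
 * Returns 0 on success, -1 on failure (table left unchanged). */
int dyn_table_resize(struct dyn_table *t, size_t new_size)
{
    struct dyn_node **nb;
    size_t i;

    if (new_size == 0) return -1;
    nb = calloc(new_size, sizeof(*nb));
    if (!nb) return -1;
    for (i = 0; i < t->nbuckets; i++) {
        struct dyn_node *n = t->buckets[i], *next;

        for (; n; n = next) {
            size_t slot = n->hash % new_size;

            next = n->next;       /* save before relinking */
            n->next = nb[slot];
            nb[slot] = n;
        }
    }
    free(t->buckets);
    t->buckets = nb;
    t->nbuckets = new_size;
    return 0;
}
```

every node is touched once per resize, so the add that crosses the threshold
pays for relinking the whole table while all the other adds pay nothing.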

now... more interestingly - i now started looking at the test. it is very
artificial. at most 2 copies of a string, so n repeated adds, and all short
strings. not very representative of common usage. in common usage i have
seen some strings with refcounts of > 200. in fact the del in the test won't
work with stringshare. on del u need to supply the actual string pointer - not
the snprintf'd buffer. so nothing gets found and deleted. i.e. it's meant to
work with:

const char *s;

s = evas_stringshare_add("string");   /* returns the shared pointer */
...
evas_stringshare_del(s);              /* must be that same pointer */

i.e. - pass back the same pointer stringshare returned.
evas_stringshare_del("string") will not work.
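
to show why the pointer matters, here is a minimal sketch of a refcounted
string share with a fixed bucket count. it is NOT the evas source: share_add,
share_del, share_node and the hash function are hypothetical, and share_del
returns an int (found or not) purely so the behaviour can be checked.

```c
#include <stdlib.h>
#include <string.h>

#define SHARE_BUCKETS 1024   /* fixed bucket count, illustrative value */

struct share_node {
    struct share_node *next;   /* chain within one bucket */
    int                refs;   /* reference count */
    char               str[];  /* string stored inline: 1 alloc per unique string */
};

static struct share_node *share_table[SHARE_BUCKETS];

static unsigned int share_hash(const char *s)
{
    unsigned int h = 5381;

    while (*s) h = (h * 33) ^ (unsigned char)*s++;
    return h % SHARE_BUCKETS;
}

/* Return the shared copy of str, allocating only if it is new. */
const char *share_add(const char *str)
{
    unsigned int h = share_hash(str);
    struct share_node *n;
    size_t len;

    for (n = share_table[h]; n; n = n->next) {
        if (!strcmp(n->str, str)) {
            n->refs++;             /* existing string: refcount only */
            return n->str;
        }
    }
    len = strlen(str) + 1;
    n = malloc(sizeof(*n) + len);
    if (!n) return NULL;
    memcpy(n->str, str, len);
    n->refs = 1;
    n->next = share_table[h];
    share_table[h] = n;
    return n->str;
}

/* Drop one reference. Matches by POINTER, not by contents, so str must be a
 * value returned by share_add(). Returns 1 if found, 0 otherwise. */
int share_del(const char *str)
{
    unsigned int h = share_hash(str);
    struct share_node **pn, *n;

    for (pn = &share_table[h]; (n = *pn); pn = &n->next) {
        if (n->str == str) {
            if (--n->refs == 0) {
                *pn = n->next;     /* unlink, then free the single allocation */
                free(n);
            }
            return 1;
        }
    }
    return 0;
}
```

in this sketch share_del() on a local buffer that merely holds the same
characters finds nothing, because no node's str pointer equals the buffer's
address. only the pointer handed out by share_add() matches.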

... so as such you need a test that is more representative of actual usage.

so that's just what i did. i literally logged all stringshare adds, dels etc.
in such a way that it'd produce "correct code" from a session of e17 i fiddled
with for about 5 minutes doing stuff. you'll forgive me for not including the
code as the .c file generated is 11MB (239,000 lines of c). i included it into
the compare.c infra to test both ecore and evas code and just time it doing
exactly what e does. i also included "nops" where functions are called, but do
nothing, so we can remove the simple test harness and function call overhead
and compare just the core. as it was a little too fast i made it run 1000 loops
of what e actually does one after the other (yes it'll be bad as it doesn't
start with a clean slate but better than nothing). result for 1000 iterations
one after the other:

evas: 20.691495
ecore: 30.510302
nops: 3.444793

real factor: 1.57

so really 20.69 - 3.44 vs 30.51 - 3.44 - i.e. 17.25 vs 27.07 (evas being the
lower). yes - this means a lot of things will get high refcounts as things get
re-added a lot and then not removed, so here are the raw results of only 1
iteration:

evas: 0.031672
ecore: 0.045482
nops: 0.004831

real factor: 1.51

not as accurate as the times are so small, but the same order of magnitude as
above.
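
for reference, the "real factor" figures can be reproduced from the raw
timings like this. the function name real_factor is just for illustration, and
the formula is my reading of the subtraction described above (harness overhead
removed from both sides before dividing).

```c
/* Ratio of ecore time to evas time once the harness overhead ("nops") is
 * subtracted from both. Returns 0.0 if evas has no time left after the
 * subtraction, to avoid dividing by zero. */
double real_factor(double evas, double ecore, double nops)
{
    double evas_core = evas - nops;
    double ecore_core = ecore - nops;

    if (evas_core <= 0.0) return 0.0;
    return ecore_core / evas_core;
}
```

real_factor(20.691495, 30.510302, 3.444793) gives about 1.57 and
real_factor(0.031672, 0.045482, 0.004831) gives about 1.51, matching the
figures above.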

so as such... if we are doing benchmarks to find the implementation with the
best results - at least for the case of e17, evas is the winner here. of course
if you think e17's use case is pretty atypical and you need another one, we
should continue to check.

comments?

> Were these numbers from ecore before or after cedric's changes this morning?
> 
> On Wed, May 7, 2008 at 1:20 PM, Peter Wehrfritz <[EMAIL PROTECTED]>
> wrote:
> > Yesterday we had a discussion on irc, if we should put abstract data
> >  types of ecore and of evas into a single standalone lib. The whole
> >  discussion came up because of the two implementations of the shared
> >  strings. And in fact if we really want to share strings efficiently, we
> >  have to share them over the borders of the different libraries.
> >
> >  Raster's idea was to first put the shared string stuff in this new
> >  library because both implementations have the same api (of course the
> >  names are different) and the same functionality. Remains the question
> >  which implementation we use.
> >
> >
> >  Therefore I've written a small test application, to measure the time it
> >  takes to create new strings, access new strings and to delete them.
> >
> >  You can find the program here:
> >  mowem.de/ecore/compare.c
> >
> >  And here is the plot of the result
> >  mowem.de/ecore/result_direct.ps
> >
> >  In short: since evas uses a static bucket count it performs very well
> >  for few strings; for many strings ecore has a good access time, but
> >  still pays the price of reordering whenever the bucket count grows or
> >  shrinks.
> >
> >  Peter
> >
> >  -------------------------------------------------------------------------
> >  This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
> >  Don't miss this year's exciting event. There's still time to save $100.
> >  Use priority code J8TL2D2.
> >  
> > http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
> >  _______________________________________________
> >  enlightenment-devel mailing list
> >  [email protected]
> >  https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
> >
> 

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    [EMAIL PROTECTED]
