On Fri, Jan 18, 2013 at 12:34 AM, Gustavo Sverzut Barbieri
<barbi...@profusion.mobi> wrote:
> On Thu, Jan 17, 2013 at 12:15 PM, Cedric BAIL <cedric.b...@free.fr> wrote:
>> On Thu, Jan 17, 2013 at 10:34 PM, Gustavo Sverzut Barbieri
>> <barbi...@profusion.mobi> wrote:
>>> On Thursday, January 17, 2013, Cedric BAIL wrote:
>>>> On Thu, Jan 17, 2013 at 9:37 PM, Gustavo Sverzut Barbieri
>>>> <barbi...@profusion.mobi> wrote:
>>>> > On Thursday, January 17, 2013, Cedric BAIL wrote:
>>>> >> On Thu, Jan 17, 2013 at 7:28 PM, Gustavo Sverzut Barbieri
>>>> >> <barbi...@profusion.mobi> wrote:
>>>> >> > On Thursday, January 17, 2013, Enlightenment SVN wrote:
>>>> >> >> Log:
>>>> >> >> efl: stupid micro optimization.
>>>> >> >>
>>>> >> >>   This single test accounted for 1% of my terminology benchmark.
>>>> >> >>   I am considering moving evas_string_char_next_get and
>>>> >> >>   eina_unicode_utf8_get_next to become inline as their function
>>>> >> >>   entry/exit point account for 3% of the same benchmark.
>>>> >> >>
>>>> >> >>   The biggest win would be to get rid of the memcpy in
>>>> >> >>   _termpty_text_copy, which accounts for 16%.
>>>> >> >>
>>>> >> >>   In the micro optimization part, we also still do too much malloc
>>>> >> >>   in font_draw_prepare, as we don't recycle the array there;
>>>> >> >>   malloc/free there accounts for 3% of the benchmark. In the same
>>>> >> >>   ballpark, _text_save_top accounts for 2% of the time in malloc/free.
>>>> >> >>
>>>> >> >>   In that same benchmark, evas_object_textgrid_render accounts for 5%,
>>>> >> >>   where 4% of its time is spent in evas_common_font_draw_prepare. At
>>>> >> >>   this point I am not sure that rewriting textgrid is going to help
>>>> >> >>   us at all. We will win almost as much by just inlining the get_next
>>>> >> >>   things in evas and eina for a minute of development time.
>>>> >> >
>>>> >> > It's a bit naive to think so, because you'd be able to change the
>>>> >> > algorithm and avoid conversions. All in all you could just give the
>>>> >> > engine the same array that terminology fills (cell row array),
>>>> >> > together with region and context (clipper, cutouts) and glyph bitmap.
>>>> >> >
>>>> >> > Particularly the glyph bitmap could be optimized, as it's an int hash
>>>> >> > but we know A-Za-z0-9 are hot: we could have the ASCII printable
>>>> >> > range in an array while everything else goes to a hash.
>>>> >>
>>>> >> Time spent in evas_common_font_draw: 4%. Time spent in
>>>> >> evas_object_textgrid_*: > 2%.
>>>> >> Time spent in _cb_fd_read: 82%, with evas_string_char_next_get being
>>>> >> 15% and memcpy 18%.
>>>> >>
>>>> >> I don't see how optimizing textgrid is going to change this number at
>>>> >> all.
>>>> >
>>>> > You'd not be doing these at all. That's why/how.
>>>>
>>>> What do you mean by "these"? I am going to redo the benchmark with
>>>> perf on another computer to see if there is something else.
>>>
>>>
>>> These calls. If you do it the way I imagine, you give the screen (array
>>> of cells) to the renderer much like the way you give pixels to an image.
>>> There is no need to call a bunch of things, like no need to create the
>>> text props and such. All you need is the cell info (palette index and
>>> attributes), the palette contents and a way to convert Unicode -> bitmap.
>>
>> All the cost of this 4% goes into the blitting, which you can't win. So
>> you are basically going after a 2% improvement, which you can maybe
>> reduce a little but not much, as we don't create the text props at all
>> in more than 99% of the cases and the text props are a mapping from
>> unicode to bitmap that is recycled very often. I really think you are
>> focusing on the wrong thing here.
>
> You can allow the engine to cache not only the glyphs, but the cells
> (with bg/font colors, translating into a pure image blit). Text grid is
> pretty cache friendly for one alphabet, and for terminology that will
> definitely hit ASCII: user data may be outside it, but compile
> output, commands and similar are all ASCII :-) You can have an array
> of printable elements and access them in O(1), no translation.

Now I get it! OK, we could basically save maybe half of this 2%
and move the rest into the rendering thread (as it is still required
to walk over the grid). So I now do understand your idea.
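
To make the array-plus-hash idea quoted above concrete, here is a rough
sketch of what such a lookup could look like. It is not evas code: the
names (Glyph_Cache, glyph_cache_get, glyph_cache_put), the table size and
the opaque bitmap pointer are all made up for illustration. Printable
ASCII is indexed directly, everything else falls back to a small
open-addressing table.

```c
#include <stddef.h>
#include <stdint.h>

#define FAST_FIRST 0x20u   /* first printable ASCII codepoint */
#define FAST_LAST  0x7eu   /* last printable ASCII codepoint */
#define FAST_COUNT (FAST_LAST - FAST_FIRST + 1u)
#define SLOW_SLOTS 1024u   /* assumed size; power of two for cheap masking */

/* Hypothetical cache entry; a NULL bitmap marks an empty slot. */
typedef struct
{
   uint32_t    codepoint;
   const void *bitmap;
} Glyph_Entry;

/* Must start zero-initialized (static storage or memset). */
typedef struct
{
   Glyph_Entry fast[FAST_COUNT]; /* direct index, no hashing */
   Glyph_Entry slow[SLOW_SLOTS]; /* open addressing, linear probing */
} Glyph_Cache;

/* Finds the slot holding cp, or the empty slot where it would go. */
static Glyph_Entry *
slow_slot(Glyph_Cache *gc, uint32_t cp)
{
   uint32_t start = (cp * 2654435761u) & (SLOW_SLOTS - 1u);
   uint32_t n;

   for (n = 0; n < SLOW_SLOTS; n++)
     {
        Glyph_Entry *e = &gc->slow[(start + n) & (SLOW_SLOTS - 1u)];

        if (!e->bitmap || e->codepoint == cp) return e;
     }
   return NULL; /* fallback table is full */
}

/* Returns 1 on success, 0 if bitmap is NULL or the fallback table is full. */
static int
glyph_cache_put(Glyph_Cache *gc, uint32_t cp, const void *bitmap)
{
   Glyph_Entry *e;

   if (!bitmap) return 0;
   if (cp >= FAST_FIRST && cp <= FAST_LAST)
     e = &gc->fast[cp - FAST_FIRST];
   else
     e = slow_slot(gc, cp);
   if (!e) return 0;
   e->codepoint = cp;
   e->bitmap = bitmap;
   return 1;
}

/* Returns the cached bitmap, or NULL if cp has not been cached yet. */
static const void *
glyph_cache_get(Glyph_Cache *gc, uint32_t cp)
{
   Glyph_Entry *e;

   if (cp >= FAST_FIRST && cp <= FAST_LAST)
     return gc->fast[cp - FAST_FIRST].bitmap; /* O(1), no translation */
   e = slow_slot(gc, cp);
   return e ? e->bitmap : NULL;
}
```

The sketch never evicts anything, so a real cache would still need a
policy for when the fallback table fills up.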

> And if you do so, the grid contents need no special handling. It
> will be like an image; you can think of it as an image with a different
> colorspace :-)
>
> As for _cb_fd_read(), indeed it's outside of the concept of
> textgrid... and looking at it, it's pretty obvious that some
> optimizations could be applied, simply by changing how it converts the
> chars to unicode and then hands them to handle_buf just to grow its
> buffer and copy. The (n == 0) block can also be simplified and optimized
> at the same time :-)

That's where we should focus. Once done there, optimizing the textgrid
may give us some speedup, but right now this part is hiding
everything.
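
For the read path, one possible shape (only a sketch, not what
terminology or eina do today) is to decode UTF-8 straight into a
caller-owned codepoint buffer, with the ASCII case handled inline so
there is no per-character function call and no intermediate copy. The
function name utf8_decode_run and its signature are hypothetical, and
this is not a validating decoder: it does not reject overlong forms or
surrogates.

```c
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT 0xfffdu /* emitted for bytes that cannot be decoded */

/* Decodes complete UTF-8 sequences from in[0..len) into out[0..out_max).
 * Returns the number of codepoints written. *consumed receives the number
 * of input bytes used, so the caller can keep an incomplete trailing
 * sequence around and prepend it to the next read(). */
static size_t
utf8_decode_run(const unsigned char *in, size_t len,
                uint32_t *out, size_t out_max, size_t *consumed)
{
   size_t i = 0, n = 0;

   while (i < len && n < out_max)
     {
        unsigned char c = in[i];
        uint32_t cp;
        size_t need, k;

        /* ASCII fast path: one compare, one store. */
        if (c < 0x80) { out[n++] = c; i++; continue; }

        if ((c & 0xe0) == 0xc0)      { need = 1; cp = c & 0x1f; }
        else if ((c & 0xf0) == 0xe0) { need = 2; cp = c & 0x0f; }
        else if ((c & 0xf8) == 0xf0) { need = 3; cp = c & 0x07; }
        else { out[n++] = REPLACEMENT; i++; continue; } /* stray byte */

        /* Incomplete tail: stop and wait for more input. */
        if (len - i - 1 < need) break;

        for (k = 1; k <= need; k++)
          {
             if ((in[i + k] & 0xc0) != 0x80) break; /* not a continuation */
             cp = (cp << 6) | (uint32_t)(in[i + k] & 0x3f);
          }
        if (k <= need)
          {
             /* Broken sequence: emit a replacement, re-scan the bad byte. */
             out[n++] = REPLACEMENT;
             i += k;
             continue;
          }
        out[n++] = cp;
        i += need + 1;
     }
   *consumed = i;
   return n;
}
```

If the output buffer is allocated once and reused across reads, the
grow-and-copy step goes away for the common case as well.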
--
Cedric BAIL
