On Fri, Jan 18, 2013 at 12:34 AM, Gustavo Sverzut Barbieri
<barbi...@profusion.mobi> wrote:
> On Thu, Jan 17, 2013 at 12:15 PM, Cedric BAIL <cedric.b...@free.fr> wrote:
>> On Thu, Jan 17, 2013 at 10:34 PM, Gustavo Sverzut Barbieri
>> <barbi...@profusion.mobi> wrote:
>>> On Thursday, January 17, 2013, Cedric BAIL wrote:
>>>> On Thu, Jan 17, 2013 at 9:37 PM, Gustavo Sverzut Barbieri
>>>> <barbi...@profusion.mobi> wrote:
>>>> > On Thursday, January 17, 2013, Cedric BAIL wrote:
>>>> >> On Thu, Jan 17, 2013 at 7:28 PM, Gustavo Sverzut Barbieri
>>>> >> <barbi...@profusion.mobi> wrote:
>>>> >> > On Thursday, January 17, 2013, Enlightenment SVN wrote:
>>>> >> >> Log:
>>>> >> >> efl: stupid micro optimization.
>>>> >> >>
>>>> >> >> This single test accounted for 1% of my terminology benchmark.
>>>> >> >> I am considering moving evas_string_char_next_get and
>>>> >> >> eina_unicode_utf8_get_next to become inline as their function
>>>> >> >> entry/exit points account for 3% of the same benchmark.
>>>> >> >>
>>>> >> >> The biggest win would be to get rid of the memcpy _termpty_text_copy
>>>> >> >> that accounts for 16%.
>>>> >> >>
>>>> >> >> In the micro optimization part, we also still do too much malloc
>>>> >> >> in font_draw_prepare as we don't recycle the array there, and it
>>>> >> >> accounts for 3% of the benchmark in malloc/free there. In the same
>>>> >> >> ballpark, _text_save_top accounts for 2% of the time in malloc/free.
>>>> >> >>
>>>> >> >> In that same benchmark, evas_object_textgrid_render accounts for 5%,
>>>> >> >> where 4% of its time is spent in evas_common_font_draw_prepare. At
>>>> >> >> this point I am not sure that rewriting textgrid is going to help us
>>>> >> >> at all. We will win almost as much by just inlining the get_next
>>>> >> >> things in evas and eina for a minute of development time.
>>>> >> >
>>>> >> > It's a bit naive to think so, because you'd be able to change the
>>>> >> > algorithm and avoid conversions. All in all you could just give the
>>>> >> > engine the same array that terminology fills (cell row array),
>>>> >> > together with region and context (clipper, cutouts) and glyph bitmap.
>>>> >> >
>>>> >> > Particularly the glyph bitmap could be optimized as it's an int hash,
>>>> >> > but we know A-Za-z0-9 are hot, so we could have the ASCII printable
>>>> >> > range in an array while everything else goes to a hash.
>>>> >>
>>>> >> Time spent in evas_common_font_draw 4%. Time spent in
>>>> >> evas_object_textgrid_*: > 2%.
>>>> >> Time spent in _cb_fd_read: 82%, with evas_string_char_next_get being
>>>> >> 15% and memcpy 18%.
>>>> >>
>>>> >> I don't see how optimizing textgrid is going to change this number at
>>>> >> all.
>>>> >
>>>> > You'd not be doing these at all. That's why/how.
>>>>
>>>> What do you mean by "these"? I am going to redo the benchmark with
>>>> perf on another computer to see if there is something else.
>>>
>>> These calls. If you do it the way I imagine, you give the screen (array of
>>> cells) to the renderer much like the way you give pixels to an image. There
>>> is no need to call a bunch of things, like no need to create the text props
>>> and such. All you need is the cell info (palette index and attributes), the
>>> palette contents and a way to convert Unicode -> bitmap.
>>
>> All the cost of this 4% goes into the blitting, which you can't win. So
>> you are basically going after a 2% improvement, which you can maybe
>> reduce a little but not much, as we don't create the text props at all
>> in more than 99% of the cases and the text props are a mapping from
>> unicode to bitmap that is recycled very often. I really think you are
>> focusing on the wrong thing here.
>
> You can allow the engine to cache not only the glyphs, but the cells
> (with bg/font colors, translating into pure image blit).
> Text grid is pretty cache friendly for one alphabet, and for
> terminology that will definitely hit ASCII as user data may be
> outside it, but compile output, commands and similar are all
> ASCII :-) You can have an array of printable elements and access
> them in O(1), no translation.
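
To make the quoted idea concrete, here is a rough sketch of a glyph
lookup with a direct-indexed array for printable ASCII and a chained
hash for everything else. This is only an illustration: Glyph_Bitmap,
Glyph_Cache and the glyph_cache_* functions are hypothetical names
made up for this example, not existing evas or eina API, and the
bucket count is an arbitrary assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* All names below are hypothetical; none of this is EFL API. */
#define ASCII_FIRST  0x20u /* first printable ASCII codepoint (space) */
#define ASCII_LAST   0x7Eu /* last printable ASCII codepoint (~) */
#define ASCII_COUNT  (ASCII_LAST - ASCII_FIRST + 1u)
#define HASH_BUCKETS 256u  /* arbitrary size for the cold path */

/* Placeholder for whatever the engine stores per glyph. */
typedef struct { int w, h; } Glyph_Bitmap;

typedef struct Hash_Node {
    uint32_t codepoint;
    const Glyph_Bitmap *glyph;
    struct Hash_Node *next;
} Hash_Node;

typedef struct {
    const Glyph_Bitmap *ascii[ASCII_COUNT]; /* hot path: direct index */
    Hash_Node *buckets[HASH_BUCKETS];       /* cold path: chained hash */
} Glyph_Cache;

static int is_printable_ascii(uint32_t cp)
{
    return cp >= ASCII_FIRST && cp <= ASCII_LAST;
}

/* Store a glyph. Returns 0 on success, -1 if allocation fails. */
static int glyph_cache_put(Glyph_Cache *gc, uint32_t cp,
                           const Glyph_Bitmap *g)
{
    assert(gc != NULL);
    if (is_printable_ascii(cp)) {
        gc->ascii[cp - ASCII_FIRST] = g;
        return 0;
    }
    Hash_Node **head = &gc->buckets[cp % HASH_BUCKETS];
    for (Hash_Node *n = *head; n; n = n->next) {
        if (n->codepoint == cp) { /* replace an existing entry */
            n->glyph = g;
            return 0;
        }
    }
    Hash_Node *n = malloc(sizeof(*n));
    if (!n) return -1;
    n->codepoint = cp;
    n->glyph = g;
    n->next = *head;
    *head = n;
    return 0;
}

/* Look up a glyph. O(1) array access for printable ASCII,
 * bucket walk otherwise. Returns NULL when nothing is cached. */
static const Glyph_Bitmap *glyph_cache_get(const Glyph_Cache *gc,
                                           uint32_t cp)
{
    if (is_printable_ascii(cp))
        return gc->ascii[cp - ASCII_FIRST];
    for (const Hash_Node *n = gc->buckets[cp % HASH_BUCKETS]; n; n = n->next)
        if (n->codepoint == cp) return n->glyph;
    return NULL;
}

/* Drop every cached entry; the bitmaps themselves are not owned here. */
static void glyph_cache_clear(Glyph_Cache *gc)
{
    for (size_t i = 0; i < ASCII_COUNT; i++) gc->ascii[i] = NULL;
    for (size_t i = 0; i < HASH_BUCKETS; i++) {
        Hash_Node *n = gc->buckets[i];
        while (n) {
            Hash_Node *next = n->next;
            free(n);
            n = next;
        }
        gc->buckets[i] = NULL;
    }
}
```

A zero-initialized Glyph_Cache is a valid empty cache. Whether this
buys anything measurable depends on how often the lookup sits on the
hot path in the first place.
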

Now I get it! OK, we could basically save maybe half of this 2% and
move the rest into the rendering thread (as it is still required to
walk over the grid). So I now do understand your idea.

> And if you do such, the grid contents need no special handling. It
> will be like an image, you can think about an image with a different
> colorspace :-)
>
> As for _cb_fd_read(), indeed it's outside of the concept of
> textgrid... and looking at it, it's pretty obvious that some
> optimizations could be applied, simply by changing how it converts the
> chars to unicode, then hands it to handle_buf just to grow its buffer
> and copy. The (n == 0) block can also be simplified and optimized at
> the same time :-)

That's where we should focus. Once done there, optimizing the textgrid
may give us some speedup, but right now this part is hiding
everything.
--
Cedric BAIL

_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel