Re: Optimizations questions

Jukka Jylänki Tue, 24 Jan 2017 17:46:24 -0800

If that inner loop runs very often, my first suspects would be the sin()
and cos() trigonometric functions. They don't exist in asm.js, so each call
goes through the FFI out of the asm.js module to the browser's Math
functions, which are generally tuned for precision and not optimised for
speed at all. See
https://github.com/juj/MathGeoLib/blob/00730d87cb3ca069187cef2cc0eea4b3cd084e09/src/Math/MathFunc.cpp#L65
for the implementation I went with. That'll allow removing the if
(curSkewX/Y == 0) checks as well. Not sure if that's the cause, but it's
worth a try. Another possible cause is lots of tightly packed structs
(animation code often has them for keyframes), which cause unaligned memory
loads and stores. For those, wasm should provide a nice speedup on x86.
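
To make the sin()/cos() suggestion concrete, here is a minimal sketch of the general idea: a cheap polynomial approximation that stays inside the compiled module. It is not the MathGeoLib implementation linked above. The names wrapPi, fastSin and fastCos are hypothetical, and the roughly 1e-3 accuracy is an assumption about what the animation code can tolerate.

```cpp
#include <cmath>

// Hypothetical helpers for illustration only; not taken from the thread
// and not the MathGeoLib code.

// Wrap an angle in radians into the range [-pi, pi).
inline float wrapPi(float x)
{
    const float kPi = 3.14159265358979f;
    const float kTwoPi = 6.28318530717959f;
    return x - kTwoPi * std::floor((x + kPi) / kTwoPi);
}

// Parabolic approximation of sin(x) with one refinement step.
// The absolute error is on the order of 1e-3.
inline float fastSin(float x)
{
    const float kPi = 3.14159265358979f;
    const float B = 4.0f / kPi;
    const float C = -4.0f / (kPi * kPi);
    const float P = 0.225f; // weight of the refinement term

    x = wrapPi(x);
    const float y = B * x + C * x * std::fabs(x); // first parabola
    return P * (y * std::fabs(y) - y) + y;        // refine towards sin
}

// cos(x) expressed as sin(x + pi/2).
inline float fastCos(float x)
{
    return fastSin(x + 1.57079632679490f);
}
```

With something like this, the skew code could call fastSin(curSkewY) and fastCos(curSkewY) unconditionally. For a zero angle fastSin returns 0, while fastCos returns a value that is only approximately 1, so whether the zero checks can really go depends on how much precision the rendering needs. Profiling before and after is the only way to tell whether this is the actual bottleneck.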

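The packed-struct point can be illustrated with a small sketch as well. The two layouts below are hypothetical and are not the CKeyFrame from this thread; they only show how packing can push a float to an offset that is not a multiple of 4, and how reordering members avoids that without packing. The sizes in the comments assume a 4-byte float with 4-byte alignment.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical keyframe layouts for illustration only.

#pragma pack(push, 1)
struct PackedKeyFrame {
    std::uint8_t  hasAnimation; // offset 0
    float         scaleX;       // offset 1: misaligned for a 4-byte float
    std::uint16_t alphaOffset;  // offset 5
    float         skewX;        // offset 7: misaligned as well
};                              // sizeof == 11
#pragma pack(pop)

// Same members, ordered largest first and left unpacked, so every
// float sits on a 4-byte boundary.
struct AlignedKeyFrame {
    float         scaleX;       // offset 0
    float         skewX;        // offset 4
    std::uint16_t alphaOffset;  // offset 8
    std::uint8_t  hasAnimation; // offset 10
};                              // sizeof == 12 (one byte of tail padding)

static_assert(offsetof(PackedKeyFrame, scaleX) % alignof(float) != 0,
              "packed float is misaligned");
static_assert(offsetof(AlignedKeyFrame, skewX) % alignof(float) == 0,
              "reordered float is aligned");
```

The aligned version costs one extra byte per keyframe here, which is usually a small price if the fields are read in a hot loop.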

2017-01-20 20:50 GMT+02:00 Alon Zakai <alonza...@gmail.com>:

> Thanks for the source code, but I'd need something that can be compiled by
> itself in order to work on it. Also with as much of the surrounding code
> removed as possible, so that only the actually relevant code is left.
>
> On Fri, Jan 20, 2017 at 2:21 AM, Sergey Solozhentsev <
> sergeysolozhent...@gmail.com> wrote:
>
>> I've already optimized this function, but the issue still exists. This is
>> the generated JavaScript code:
>> function bG(d, e) {
>>         d = d | 0;
>>         e = e | 0;
>>         var f = 0,
>>             h = 0,
>>             j = 0,
>>             k = 0,
>>             l = 0.0,
>>             m = 0.0,
>>             n = 0.0,
>>             o = 0.0,
>>             p = 0.0,
>>             q = 0.0,
>>             r = 0.0,
>>             s = 0,
>>             t = 0,
>>             u = 0,
>>             v = 0,
>>             w = 0.0,
>>             x = 0,
>>             y = 0,
>>             z = 0,
>>             A = 0,
>>             B = 0,
>>             C = 0,
>>             D = 0,
>>             E = 0,
>>             F = 0,
>>             G = 0,
>>             H = 0,
>>             I = 0,
>>             J = 0,
>>             K = 0.0;
>>         J = i;
>>         i = i + 80 | 0;
>>         E = J + 60 | 0;
>>         y = J + 48 | 0;
>>         s = J + 56 | 0;
>>         t = J + 52 | 0;
>>         x = J + 24 | 0;
>>         D = J + 16 | 0;
>>         z = J + 12 | 0;
>>         A = J + 8 | 0;
>>         B = J + 4 | 0;
>>         C = J;
>>         H = J + 64 | 0;
>>         v = c[d + 4 >> 2] | 0;
>>         u = (c[v + 16 >> 2] | 0) + -1 | 0;
>>         v = v + 12 | 0;
>>         k = c[v >> 2] | 0;
>>         if ((c[(c[k + (u << 2) >> 2] | 0) + 92 >> 2] | 0) >>> 0 < e >>> 0) {
>>             i = J;
>>             return
>>         }
>>         I = d + 12 | 0;
>>         f = c[I >> 2] | 0;
>>         a: do
>>                 if (f >>> 0 < u >>> 0)
>>                     while (1) {
>>                         h = f + 1 | 0;
>>                         if ((c[(c[k + (h << 2) >> 2] | 0) + 92 >> 2] | 0) >>> 0 > e >>> 0) break a;
>>                         c[I >> 2] = h;
>>                         if (h >>> 0 < u >>> 0) f = h;
>>                         else {
>>                             f = h;
>>                             break
>>                         }
>>                     }
>>             while (0);
>>             h = c[k + (f << 2) >> 2] | 0;
>>         j = h + 92 | 0;
>>         if ((c[j >> 2] | 0) >>> 0 > e >>> 0) {
>>             do {
>>                 f = f + -1 | 0;
>>                 h = c[k + (f << 2) >> 2] | 0;
>>                 j = h + 92 | 0
>>             } while ((c[j >> 2] | 0) >>> 0 > e >>> 0);
>>             c[I >> 2] = f;
>>             F = h;
>>             f = j
>>         } else {
>>             F = h;
>>             f = j
>>         }
>>         G = c[d + 8 >> 2] | 0;
>>         G = xk[c[(c[G >> 2] | 0) + 8 >> 2] & 255](G) | 0;
>>         if (!(a[F + 88 >> 0] | 0)) {
>>             h = d + 16 | 0;
>>             f = c[I >> 2] | 0;
>>             if ((c[h >> 2] | 0) == (f | 0)) {
>>                 i = J;
>>                 return
>>             }
>>             c[h >> 2] = f;
>>             CQ(G, F + 60 | 0);
>>             HN(s, c[F + 48 >> 2] | 0);
>>             c[E >> 2] = c[s >> 2];
>>             qQ(G, E);
>>             rQ(G, F + 52 | 0);
>>             i = J;
>>             return
>>         }
>>         h = c[f >> 2] | 0;
>>         f = c[I >> 2] | 0;
>>         if ((h | 0) == (e | 0)) {
>>             c[d + 16 >> 2] = f;
>>             CQ(G, F + 60 | 0);
>>             HN(t, c[F + 48 >> 2] | 0);
>>             c[E >> 2] = c[t >> 2];
>>             qQ(G, E);
>>             rQ(G, F + 52 | 0);
>>             i = J;
>>             return
>>         }
>>         if ((f | 0) == (u | 0)) s = F;
>>         else s = c[(c[v >> 2] | 0) + (f + 1 << 2) >> 2] | 0;
>>         v = c[F + 84 >> 2] | 0;
>>         w = +Uk[c[(c[v >> 2] | 0) + 12 >> 2] & 3](v, e - h | 0, (c[s + 92 >> 2] | 0) - h | 0);
>>         l = +g[s + 8 >> 2];
>>         m = +g[F + 8 >> 2];
>>         if (l != m) m = m + w * (l - m);
>>         l = +g[s + 12 >> 2];
>>         n = +g[F + 12 >> 2];
>>         if (l != n) n = n + w * (l - n);
>>         o = +g[F + 24 >> 2];
>>         l = +g[F + 16 >> 2];
>>         o = o == 0.0 ? l : l + w * o;
>>         l = +g[F + 28 >> 2];
>>         r = +g[F + 20 >> 2];
>>         l = l == 0.0 ? r : r + w * l;
>>         if (l == 0.0) r = 0.0;
>>         else {
>>             q = l;
>>             r = -(n * +S(+q));
>>             n = n * +R(+q)
>>         }
>>         g[x + 8 >> 2] = r;
>>         g[x + 12 >> 2] = n;
>>         if (o == 0.0) {
>>             q = m;
>>             p = 0.0
>>         } else {
>>             p = o;
>>             q = m * +R(+p);
>>             p = m * +S(+p)
>>         }
>>         g[x >> 2] = q;
>>         g[x + 4 >> 2] = p;
>>         l = +g[s >> 2];
>>         m = +g[F >> 2];
>>         if (l != m) m = m + w * (l - m);
>>         o = +g[s + 4 >> 2];
>>         l = +g[F + 4 >> 2];
>>         if (o != l) l = l + w * (o - l);
>>         h = x + 16 | 0;
>>         g[h >> 2] = m;
>>         f = x + 20 | 0;
>>         g[f >> 2] = l;
>>         K = +g[F + 32 >> 2];
>>         o = +g[F + 36 >> 2];
>>         g[h >> 2] = m + (m - (m + (q * K + r * o)));
>>         g[f >> 2] = l + (l - (l + (K * p + o * n)));
>>         CQ(G, x);
>>         f = c[F + 48 >> 2] | 0;
>>         h = s + 48 | 0;
>>         if ((f | 0) == (c[h >> 2] | 0)) {
>>             if ((c[I >> 2] | 0) != (c[d + 16 >> 2] | 0)) {
>>                 HN(C, f);
>>                 c[E >> 2] = c[C >> 2];
>>                 qQ(G, E)
>>             }
>>         } else {
>>             HN(z, f);
>>             HN(A, c[h >> 2] | 0);
>>             c[y >> 2] = c[z >> 2];
>>             c[E >> 2] = c[A >> 2];
>>             aG(D, y, E, w);
>>             c[B >> 2] = c[D >> 2];
>>             c[E >> 2] = c[B >> 2];
>>             qQ(G, E)
>>         }
>>         k = F + 52 | 0;
>>         j = s + 52 | 0;
>>         if (!(vN(k, j) | 0)) {
>>             f = d + 16 | 0;
>>             if ((c[I >> 2] | 0) != (c[f >> 2] | 0)) rQ(G, k)
>>         } else {
>>             f = b[s + 58 >> 1] | 0;
>>             h = b[F + 58 >> 1] | 0;
>>             if (f << 16 >> 16 != h << 16 >> 16) f = ~~(+(h << 16 >> 16) + w * +((f << 16 >> 16) - (h << 16 >> 16) | 0));
>>             b[H + 6 >> 1] = f;
>>             f = b[j >> 1] | 0;
>>             h = b[k >> 1] | 0;
>>             if (f << 16 >> 16 != h << 16 >> 16) f = ~~(+(h << 16 >> 16) + w * +((f << 16 >> 16) - (h << 16 >> 16) | 0));
>>             b[H >> 1] = f;
>>             f = b[s + 54 >> 1] | 0;
>>             h = b[F + 54 >> 1] | 0;
>>             if (f << 16 >> 16 != h << 16 >> 16) f = ~~(+(h << 16 >> 16) + w * +((f << 16 >> 16) - (h << 16 >> 16) | 0));
>>             b[H + 2 >> 1] = f;
>>             f = b[s + 56 >> 1] | 0;
>>             h = b[F + 56 >> 1] | 0;
>>             if (f << 16 >> 16 != h << 16 >> 16) f = ~~(+(h << 16 >> 16) + w * +((f << 16 >> 16) - (h << 16 >> 16) | 0));
>>             b[H + 4 >> 1] = f;
>>             rQ(G, H);
>>             f = d + 16 | 0
>>         }
>>         c[f >> 2] = c[I >> 2];
>>         i = J;
>>         return
>>     }
>>
>> and C++ code is
>>
>> void CAnimationData::updateForFrame(size_t frame)
>> {
>> const auto prototype = (CAnimationDataPrototype*)m_prototype;
>> const size_t keyFrameSize = prototype->m_keyframes.size() - 1;
>> // XXX hack for double speed animations
>> if (frame > prototype->m_keyframes[keyFrameSize]->index)
>> {
>> return;
>> }
>>
>> while (m_currentKeyFrame < keyFrameSize &&
>> frame >= prototype->m_keyframes[m_currentKeyFrame + 1]->index)
>> {
>> ++m_currentKeyFrame;
>> }
>> while (frame < prototype->m_keyframes[m_currentKeyFrame]->index)
>> {
>> --m_currentKeyFrame;
>> }
>> ASSERT(m_currentKeyFrame <= keyFrameSize);
>>
>> const CKeyFrame* currentFrame = prototype->m_keyframes[m_currentKeyFrame];
>>
>> CRenderingNode* node = m_target->getRenderingNode();
>> if (!currentFrame->hasAnimation)
>> {
>> if (m_previousKeyframeIndex != m_currentKeyFrame)
>> {
>> m_previousKeyframeIndex = m_currentKeyFrame;
>> node->setTransformMatrix(&currentFrame->matrix);
>> node->setColorTransfrom(currentFrame->colorTransform);
>> node->setColorOffset(currentFrame->colorOffset);
>> }
>> }
>> else if (currentFrame->index == frame)
>> {
>> m_previousKeyframeIndex = m_currentKeyFrame;
>> node->setTransformMatrix(&currentFrame->matrix);
>> node->setColorTransfrom(currentFrame->colorTransform);
>> node->setColorOffset(currentFrame->colorOffset);
>> }
>> else
>> {
>> const CKeyFrame* endKeyFrame;
>> if (m_currentKeyFrame == keyFrameSize)
>> {
>> endKeyFrame = currentFrame;
>> }
>> else
>> {
>> endKeyFrame = prototype->m_keyframes[m_currentKeyFrame + 1];
>> }
>> ITween* tween = currentFrame->tween;
>> ASSERT(frame >= currentFrame->index);
>> size_t curTime = frame - currentFrame->index;
>> size_t duration = endKeyFrame->index - currentFrame->index;
>> float positionScale = tween->getPositionValue(curTime, duration);
>>
>> MatrixClass finalMatrix;
>> float curScaleX = SIMPLE_EASE(currentFrame->scaleX, endKeyFrame->scaleX);
>> float curScaleY = SIMPLE_EASE(currentFrame->scaleY, endKeyFrame->scaleY);
>>
>> const float curSkewX = (currentFrame->processedSkewX == 0.0f)
>> ? currentFrame->skewX : (currentFrame->skewX +
>> currentFrame->processedSkewX * positionScale);
>> const float curSkewY = (currentFrame->processedSkewY == 0.0f)
>> ? currentFrame->skewY : (currentFrame->skewY +
>> currentFrame->processedSkewY * positionScale);
>>
>> if (curSkewY == 0)
>> {
>> finalMatrix.m[1][0] = 0;
>> finalMatrix.m[1][1] = curScaleY;
>> }
>> else
>> {
>> const float sinSkeyY = sin(curSkewY);
>> const float cosSkeyY = cos(curSkewY);
>> finalMatrix.m[1][0] = -curScaleY * sinSkeyY;
>> finalMatrix.m[1][1] = curScaleY * cosSkeyY;
>> }
>>
>> if (curSkewX == 0)
>> {
>> finalMatrix.m[0][0] = curScaleX;
>> finalMatrix.m[0][1] = 0;
>> }
>> else
>> {
>> const float cosSkewX = cos(curSkewX);
>> const float sinSkewX = sin(curSkewX);
>> finalMatrix.m[0][0] = curScaleX * cosSkewX;
>> finalMatrix.m[0][1] = curScaleX * sinSkewX;
>> }
>>
>> const float curX = SIMPLE_EASE(currentFrame->x, endKeyFrame->x);
>> const float curY = SIMPLE_EASE(currentFrame->y, endKeyFrame->y);
>> finalMatrix.t.x = curX;
>> finalMatrix.t.y = curY;
>>
>> const CFVector transformPoint = finalMatrix.transformVector(currentFrame->transformPoint);
>> finalMatrix.t.x += curX - transformPoint.x;
>> finalMatrix.t.y += curY - transformPoint.y;
>> node->setTransformMatrix(&finalMatrix);
>> if (currentFrame->colorTransform != endKeyFrame->colorTransform)
>> {
>> CColor colorTransform = easeColor(currentFrame->colorTransform,
>> endKeyFrame->colorTransform, positionScale);
>> node->setColorTransfrom(colorTransform);
>> }
>> else
>> {
>> if (m_currentKeyFrame != m_previousKeyframeIndex)
>> {
>> node->setColorTransfrom(currentFrame->colorTransform);
>> }
>> }
>> if (currentFrame->colorOffset != endKeyFrame->colorOffset)
>> {
>> ColorOffset colorOffset;
>> colorOffset.alphaOffset = 
>> SIMPLE_EASE_SHORT(currentFrame->colorOffset.alphaOffset,
>> endKeyFrame->colorOffset.alphaOffset);
>> colorOffset.redOffset = 
>> SIMPLE_EASE_SHORT(currentFrame->colorOffset.redOffset,
>> endKeyFrame->colorOffset.redOffset);
>> colorOffset.greenOffset = 
>> SIMPLE_EASE_SHORT(currentFrame->colorOffset.greenOffset,
>> endKeyFrame->colorOffset.greenOffset);
>> colorOffset.blueOffset = 
>> SIMPLE_EASE_SHORT(currentFrame->colorOffset.blueOffset,
>> endKeyFrame->colorOffset.blueOffset);
>> node->setColorOffset(colorOffset);
>> }
>> else
>> {
>> if (m_currentKeyFrame != m_previousKeyframeIndex)
>> {
>> node->setColorOffset(currentFrame->colorOffset);
>> }
>> }
>> m_previousKeyframeIndex = m_currentKeyFrame;
>> }
>> }
>>
>> On Mon, Jan 16, 2017 at 9:40 PM, Alon Zakai <alonza...@gmail.com> wrote:
>>
>>> 1. >>>0 converts the number to an unsigned value. It is used for
>>> unsigned comparisons and math operations where the sign matters like
>>> division. This should have no effect on performance, but it does mean
>>> unsigned operations use a little more code.
>>>
>>> 2. The d=e; duplication looks like an LLVM phi that happens to have the
>>> same value on both paths. We should optimize that better. Can you provide
>>> the source code for that function, maybe reduced to just contain that part
>>> and enough code around it so optimizations don't remove it entirely?
>>>
>>> 3. The +-constant issue is something we could optimize better, but it
>>> only matters for code size and only matters fairly little, I believe, so
>>> it's never been a priority. But it is worth doing if someone is interested.
>>>
>>> On Mon, Jan 16, 2017 at 5:59 AM, Sergey Solozhentsev <
>>> sergeysolozhent...@gmail.com> wrote:
>>>
>>>> Hi, I have performance trouble with my code. I looked into the
>>>> generated code for one function and found some strange places. My
>>>> function is
>>>> function zy(a,b){
>>>> a=a|0;
>>>> b=b|0;
>>>> var d=0,e=0,f=0,g=0,h=0,j=0,k=0,l=0,m=0,n=0,o=0,p=0,q=0;
>>>> q=i;i=i+16|0;
>>>> m=q+4|0;
>>>> n=q;
>>>> o=c[a+4>>2]|0;
>>>> k=c[o+16>>2]|0;
>>>> l=k+-1|0;
>>>> o=o+12|0;
>>>> g=c[o>>2]|0;
>>>> if((c[(c[g+(l<<2)>>2]|0)+60>>2]|0)>>>0<b>>>0)
>>>> {
>>>> i=q;
>>>> return
>>>> }
>>>> p=a+12|0;
>>>> d=c[p>>2]|0;
>>>> a:
>>>> do
>>>> if(d>>>0<l>>>0)
>>>> while(1)
>>>> {
>>>> e=d+1|0;
>>>> if((c[(c[g+(e<<2)>>2]|0)+60>>2]|0)>>>0>b>>>0)
>>>> break a;
>>>> c[p>>2]=e;
>>>> if(e>>>0<l>>>0)
>>>> d=e;
>>>> else
>>>> {
>>>> d=e;
>>>> break
>>>> }
>>>> }
>>>> while(0);
>>>> e=c[g+(d<<2)>>2]|0;
>>>> f=e+60|0;
>>>> if((c[f>>2]|0)>>>0>b>>>0)
>>>> {
>>>> do
>>>> {
>>>> d=d+-1|0;
>>>> e=c[g+(d<<2)>>2]|0;
>>>> f=e+60|0
>>>> }
>>>> while((c[f>>2]|0)>>>0>b>>>0);
>>>> c[p>>2]=d;
>>>> j=e
>>>> }
>>>> else
>>>> j=e;
>>>> h=c[a+8>>2]|0;
>>>> h=Ml[c[(c[h>>2]|0)+8>>2]&255](h)|0;
>>>> g=c[j+52>>2]|0;
>>>> if(!g)
>>>> {
>>>> d=c[p>>2]|0;
>>>> if((d|0)==(l|0))
>>>> {
>>>> d=k+-2|0;
>>>> c[p>>2]=d
>>>> }
>>>> n=c[j+56>>2]|0;
>>>> m=c[f>>2]|0;
>>>> jm[c[(c[n>>2]|0)+20>>2]&511](n,b-m|0,(c[(c[(c[o>>2]|0)+(d+1<<2)>>2]|0)+60>>2]|0)-m|0);
>>>> b=c[(c[o>>2]|0)+((c[p>>2]|0)+1<<2)>>2]|0;
>>>> Fl[c[(c[a>>2]|0)+20>>2]&15](a,j,b,n,h);
>>>> Fl[c[(c[a>>2]|0)+24>>2]&15](a,j,b,n,h);
>>>> Fl[c[(c[a>>2]|0)+28>>2]&15](a,j,b,n,h);
>>>> c[a+16>>2]=c[p>>2];
>>>> i=q;
>>>> return
>>>> }
>>>> else
>>>> {
>>>> e=a+16|0;
>>>> d=c[p>>2]|0;
>>>> if((c[e>>2]|0)==(d|0))
>>>> {
>>>> i=q;return
>>>> }
>>>> c[e>>2]=d;
>>>> QI(h,g);
>>>> jG(n,c[j+40>>2]|0);
>>>> c[m>>2]=c[n>>2];
>>>> EI(h,m);
>>>> FI(h,j+44|0);
>>>> i=q;
>>>> return
>>>> }
>>>> }
>>>>
>>>> I wonder why *b>>>0* is used in some places and *b=b|0;* in others.
>>>> Why isn't a single variant used?
>>>> Also there are some optimization issues, e.g. in the code
>>>> if(e>>>0<l>>>0)
>>>> d=e;
>>>> else
>>>> {
>>>> d=e;
>>>> break
>>>> }
>>>> d = e is executed in either case. Why is it duplicated?
>>>> Also, my minus operation is replaced with a +- operation. It increases
>>>> the file size for every minus operation.
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "emscripten-discuss" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to emscripten-discuss+unsubscr...@googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>
>
