On 9/4/07, Erik Søe Sørensen <[EMAIL PROTECTED]> wrote:
>
> Hello,
> I have been doing some profiling on Globulation 2, and found some useful
> results.
> (I began doing this because the Ubuntu release (0.8.21) had periods of
> 100% CPU activity, but that seems to have been fixed :-)
>
> Map::updateLocalGradient() seems to be a bottleneck (~40% of the time
> spent there).
> In this function, there are at least two opportunities for speed
> improvement.
>
> For 21% improvement (of the function's run time), get rid of the array
> introduced in Map.cpp:3889:
> /----
> --- Map.cpp.org  2007-09-04 05:20:41.000000000 +0200
> +++ Map.cpp.erk  2007-09-04 05:23:52.000000000 +0200
> @@ -3886,22 +3886,17 @@
>                   else
>                     xr=x+1;
>
> -                Uint8 side[8];
> -                side[0]=gradient[wyu+xl];
> -                side[1]=gradient[wyu+x ];
> -                side[2]=gradient[wyu+xr];
> -
> -                side[3]=gradient[wy +xr];
> -
> -                side[4]=gradient[wyd+xr];
> -                side[5]=gradient[wyd+x ];
> -                side[6]=gradient[wyd+xl];
> -
> -                side[7]=gradient[wy +xl];
> -
> -                for (int i=0; i<8; i++)
> -                  if (side[i]>max)
> -                    max=side[i];
> +#define UPDATE_MAX(xy) {Uint8 tmp = gradient[(xy)]; if (tmp>max)
> max=tmp;}
> +                UPDATE_MAX(wyu+xl);
> +                UPDATE_MAX(wyu+x );
> +                UPDATE_MAX(wyu+xr);
> +                UPDATE_MAX(wy +xr);
> +                UPDATE_MAX(wyd+xr);
> +                UPDATE_MAX(wyd+x );
> +                UPDATE_MAX(wyd+xl);
> +                UPDATE_MAX(wy +xl);
> +#undef UPDATE_MAX
> +
>                   assert(max);
>                   if (max==1)
>                     gradient[wy+x]=1;
> \----
>
> For further ~9% improvement, avoid copying a Case structure in
> Map.cpp:3369, still in updateLocalGradient():
> /----
> --- Map.cpp.org 2007-09-04 05:20:41.000000000 +0200
> +++ Map.cpp.erk  2007-09-04 05:28:23.000000000 +0200
> @@ -3366,7 +3366,7 @@
>           for (int xl=0; xl<32; xl++)
>           {
>             int xg=(xl+posX-15)&wMask;
> -           Case c=cases[wyg+xg];
> +           Case& c=cases[wyg+xg];
>             int addrl=wyl+xl;
>             if (gradient[addrl]!=255)
>             {
> \----
> (This happens at several other locations, by the way... the second-worst
> place in Map.cpp appears to be in Map::updateRessourcesGradient(), and
> it'might also be worth the attention :-)
>
> And another thing, now we're at reading code:
> A couple of places in Map::updateLocalGradient(), there is code like this:
>          if (xxi<0)
>             xxi=0;
>         else if (xxi>31)
>             xxi=31;
>        if (yyi<0)
>             yyi=0;
>         else if (yyi>31)
>             xxi=31; // <-- Shouldn't this be yyi??
> That last line does look like a typo (or more likely a copy'n'paste
> error?).
>
> Regards,
> Erik Søe Sørensen
>
> PS: Great looking game, by the way! :-)
>
>
> [Profiling done on a ThinkPad, CPU: Intel(R) Pentium(R) M processor
> 1300MHz stepping 05]
> [Today's lesson: Don't begin profiling on a non-optimized build.]



Thats very impressive, I myself have done extensive optimization, but I
didn't tackle the main gradient algorithms, mostly I did my own ones that
are used in the Echo AI system. I couldn't make my algorithm any faster, so
instead I distributed the algorithm over multiple frames, which makes echo,
while less responsive, more smooth.

Anyhow, coincidentally, today I was actually working out how to optimize
gradient algorithms in a more computer-science type way. We can do one or
the other of the following, but both of them require keeping track of
changes to the map.

1) We initialize the gradient for every turn, when most of this information
is repeated. We could save a small amount of time by instead keeping an
ongoing initial value, updating it instead when the values themselves are
updated. Since there are two different initialization routines (for
different kinds of paths), we need to keep two of these. When doing the
gradient itself, instead of going through the entire map filtering changing
etc.. We can just do straight copy of the values, which should be faster
than accessing the whole map.

2) Alternatively, we can do a continuous update system, rather than a
complete refresh. Basically, we keep track of changes on the map, where a
change is from a path obstacle to a non obstacle, or vice versa.

a) for each obstacle->non obstacle, at that position, take the value of the
smallest neighbor, then propagate that value like the normal algorithm does,
except this algorithm only propagates into cells that will change with this
prorogation. This means that, most of the time, the algorithm will quit
quickly.

b) for each non obstacle->obstacle, mark each neighbor that has a higher
value than the cell had before it changed as 0. Propagate into these cells,
doing the same thing, propagating into each neighbor that has a higher
value, and leaving behind the new values of 0. This is basically marking the
region that is affected by the new obstacle. Each neighbor that had an equal
or lower value was marked as a source and added to a list (or set, so
duplicates don't occur). After this portion of the algorithm exits, the next
portion fills in the damaged portion of the map by propagating from the
sources like a normal gradient algorithm would.

These two updates don't have to be done in parallel, one can do a, then b or
vice versa it won't affect the result. The danger here is that the same
value may be touched many times, but in reality, the glob2 world doesn't
change much between updates, so may improve performance.

-- 
Extra cheese comes at a cost. Bradley Arsenault.
_______________________________________________
glob2-devel mailing list
glob2-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/glob2-devel

Reply via email to