On Thu, 9 Nov 2006, Marius Schebella wrote:
> for me that's a really important topic, I often run into problems with
> slow machines not fast enough to play patches.
With video this happens often, even on fast machines, and especially with
GridFlow: e.g. it's not possible to use [#fft] at 30fps unless your
resolution is really small.
> I wonder if it is possible to calculate something like flops (FLOating
> Point OPerations) per object
It wouldn't be just a count of flops; that's a rather useless unit of
measure unless you know that all your flops take the same amount of time,
and what you care about is the time. In Numerical Analysis,
multiplications and additions are usually counted separately, because
they're expected to be in two different classes of speed.
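
To make the point about separate counts concrete, here is a minimal sketch in Python (not Pd code). The function names and the per-operation costs are hypothetical, picked only for illustration; real costs depend on the machine.

```python
def dot_product_ops(n):
    """Operation counts for a length-n dot product, kept separate per class."""
    return {"mul": n, "add": n - 1}


def estimated_time(ops, seconds_per_op):
    """Weight each class of operation by its own (assumed) cost."""
    return sum(count * seconds_per_op[kind] for kind, count in ops.items())


ops = dot_product_ops(1024)
# Hypothetical per-operation costs in seconds; measure your own hardware.
costs = {"mul": 4e-9, "add": 1e-9}
print(ops, estimated_time(ops, costs))
```

A single "flops" figure would lump the two classes together and lose the difference in cost between them.
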
> and have a list for all the pd objects.
This would have to be parametrized according to some things, like length
of list arguments, block size, and possibly a lot of arguments.
Something like [fft~] does more work per sample when the block size is
larger; I suspect that [fiddle~]'s situation is at least somewhat similar,
but I haven't tested.
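
As a rough illustration of why the per-sample cost grows with block size, the sketch below uses the textbook butterfly count of a radix-2 FFT, (N/2)*log2(N) per block of N samples. This is an assumption for illustration only; it is not measured from Pd's actual [fft~] implementation, and the function name is made up.

```python
def fft_butterflies_per_sample(block_size):
    """Textbook radix-2 estimate: (N/2)*log2(N) butterflies per N samples."""
    log2_n = block_size.bit_length() - 1  # exact for powers of two
    butterflies = (block_size // 2) * log2_n
    return butterflies / block_size


for n in (64, 256, 1024):
    print(n, fft_butterflies_per_sample(n))
```

Under this model the work per sample grows like log2(N)/2, so going from a block size of 64 to 1024 raises it from 3 to 5 butterflies per sample.
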
GEM/PDP would be harder due to framesize differences and to how
one is supposed to measure time spent on the GPU.
I expect GridFlow to be a lot harder to measure; e.g. while [pix_convolve]
will take time roughly proportional to the size of the picture (in pixels)
times the size of the kernel (in pixels), in GridFlow you should only
consider the number of nonzero entries in the kernel (!!). And then
[#convolve] has special options like "op" and "fold" which aren't in any
other implementation of convolution that I've seen in pd, and that can
change the run time radically. And then [#convolve] supports *any* number
of channels, while [pix_convolve] supports at most 4. And so on...
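
The two cost models described above can be compared with a small sketch in Python. It only counts multiply-adds; it is not taken from either implementation, it ignores the "op" and "fold" options, and the function names and the `channels` parameter are hypothetical.

```python
def dense_cost(width, height, kernel, channels=1):
    """Multiply-adds if every kernel entry is visited for every pixel."""
    entries = sum(len(row) for row in kernel)
    return width * height * channels * entries


def sparse_cost(width, height, kernel, channels=1):
    """Multiply-adds if only nonzero kernel entries are visited."""
    nonzero = sum(1 for row in kernel for value in row if value != 0)
    return width * height * channels * nonzero


# A 3x3 Laplacian-style kernel: 9 entries, 5 of them nonzero.
kernel = [[0, 1, 0],
          [1, -4, 1],
          [0, 1, 0]]
print(dense_cost(320, 240, kernel))   # 691200
print(sparse_cost(320, 240, kernel))  # 384000
```

For the same picture and kernel the two models already differ by almost a factor of two, which is why a single table entry per object class would not be enough.
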
> it really would be great to know the benchmarks of different
> hardware systems. marius.
Even though it's impossible to get a complete picture about the speed of
each class, I think that it's worth trying. However, this may require some
modifications to Pd. It's possible to make benchmarks in pure pd, but this
would require a big mess of [timer] and [t] objects in order to prevent
sent messages from being counted as part of the object's running time. If
it were done in C in a similar way, it would run much faster, which would
be important in order to have sufficiently accurate figures.
Even then, I fear that it wouldn't be that accurate when lots of short
operations are involved. In that case, a statistical profiler would be more
appropriate.
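
The idea of keeping measurement overhead out of the figure can be sketched outside Pd. The Python below is only an analogy, not Pd code: it times an empty loop first and subtracts that from the timed loop, much as the [timer]/[t] arrangement would have to exclude message passing. The names `time_per_call` and `short_op` are hypothetical.

```python
import time


def time_per_call(func, repeats=100_000):
    """Average seconds per call, with the empty-loop overhead subtracted."""
    start = time.perf_counter()
    for _ in range(repeats):
        pass
    overhead = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(repeats):
        func()
    total = time.perf_counter() - start

    # Clamp at zero: for very short operations, timer noise can exceed the
    # thing being measured.
    return max(total - overhead, 0.0) / repeats


def short_op():
    return 3.0 * 4.0 + 1.0


print(time_per_call(short_op))
```

Running this a few times gives noticeably different numbers for such a short operation, which is the accuracy problem mentioned above.
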
_ _ __ ___ _____ ________ _____________ _____________________ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada