Re: [PD] variance from mapping library

2009-03-23 Thread Mathieu Bouchard

On Mon, 23 Mar 2009, Mathieu Bouchard wrote:

Here is another patch. I call it [mapping/variance2]. It computes a moving 
variance using exactly the last $1 values and no more than that. It has 
more rounding error, but it doesn't have any unwanted delay between the two 
moving averages used in computing the variance.


Well, actually, my [mapping/variance2] also doesn't have the same drift 
characteristics (nor other error characteristics) as [mapping/variance]; 
my examples about [mapping/variance] don't drift when remade with 
[mapping/variance2], but some other examples (that I didn't show) are 
worse with [mapping/variance2].


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801, Montréal, Québec
___
Pd-list@iem.at mailing list
UNSUBSCRIBE and account-management -> http://lists.puredata.info/listinfo/pd-list


Re: [PD] variance from mapping library

2009-03-23 Thread Mathieu Bouchard

On Mon, 23 Mar 2009, Mathieu Bouchard wrote:

You can send simple repeated sequences to [mapping/variance] to show that not 
only can it drift into negative values almost endlessly, but it doesn't 
even compute the variance of N values.


Here is another patch. I call it [mapping/variance2]. It computes a moving 
variance using exactly the last $1 values and no more than that. It 
has more rounding error, but it doesn't have any unwanted delay between 
the two moving averages used in computing the variance.


#N canvas 744 170 273 272 10;
#X obj 39 33 inlet;
#X obj 39 184 outlet;
#X obj 39 52 t f f;
#X obj 39 165 -;
#X obj 39 83 t f f;
#X obj 39 111 *;
#X obj 39 130 mean_n \$1;
#X obj 114 81 mean_n \$1;
#X obj 114 101 t f f;
#X obj 114 129 *;
#X connect 0 0 2 0;
#X connect 2 0 4 0;
#X connect 2 1 7 0;
#X connect 3 0 1 0;
#X connect 4 0 5 0;
#X connect 4 1 5 1;
#X connect 5 0 6 0;
#X connect 6 0 3 0;
#X connect 7 0 8 0;
#X connect 8 0 9 0;
#X connect 8 1 9 1;
#X connect 9 0 3 1;
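For readers who don't run Pd: the patch keeps a single window of the last $1 values, runs [mean_n $1] on both x and x*x, and outputs mean(x*x) - mean(x)^2. A rough Python sketch of the same idea (the class and names are mine, not part of the mapping library):

```python
from collections import deque

class Variance2:
    """Moving variance over exactly the last n values,
    computed as mean(x*x) - mean(x)**2, like the patch above."""
    def __init__(self, n):
        self.xs = deque(maxlen=n)  # window of the last n inputs

    def __call__(self, x):
        self.xs.append(x)
        k = len(self.xs)
        m  = sum(self.xs) / k                  # moving average of x
        m2 = sum(v * v for v in self.xs) / k   # moving average of x*x
        return m2 - m * m
```

Because both averages are recomputed from the same window, there is no delay between them; the price is the rounding of the full re-summation, and mean(x*x) - mean(x)^2 can still dip a hair below zero for near-constant input.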



Re: [PD] variance from mapping library

2009-03-23 Thread cyrille henry



Mathieu Bouchard wrote:

On Thu, 19 Mar 2009, Oded Ben-Tal wrote:

...


For the "mapping" library, there isn't much of a choice but to remake it 
with a slower algorithm, unless someone knows a magic trick for 
cancelling almost all of the error while not running so slow. 


No need to remake another mean_n object using a better but slower algorithm: 
you just have to send a [mode 1< message to mean_n; it switches the 
internal algorithm to a more accurate one.

I just added this message to the variance object...

Cyrille



Re: [PD] variance from mapping library

2009-03-23 Thread Mathieu Bouchard

On Mon, 23 Mar 2009, Oded Ben-Tal wrote:


The mapping lib is optimized to work with numbers from -1 to 1.
Do you still have errors with numbers in that range?


It seems not.


Try again harder. Look at the two mails I've just sent about this subject.



Re: [PD] variance from mapping library

2009-03-23 Thread Oded Ben-Tal

The mapping lib is optimized to work with numbers from -1 to 1.
Do you still have errors with numbers in that range?


It seems not.
I do get negative numbers even with input of ~200 (no need to get to 
100000), and once it gets into negative territory it settles into negative 
numbers whenever the input is constant (i.e. zero variance).


Oded



Re: [PD] variance from mapping library

2009-03-23 Thread Mathieu Bouchard

On Mon, 23 Mar 2009, Mathieu Bouchard wrote:

You can send simple repeated sequences to [mapping/variance] to show 
that not only can it drift into negative values almost endlessly, but 
it doesn't even compute the variance of N values. This is because there 
is a moving average of something involving another moving average. This 
makes a global moving average of 2N-1 values instead of N, and then the 
window isn't rectangular anymore, it's triangular, because different 
values are counted a different number of times depending on how old they 
are.


errata: I mean that a moving average of a moving average is another moving 
average with a different window (the convolution of the two windows). The 
case of [mapping/variance] is more complicated because of the extra 
operations being done, so there is no easy way to describe it, but you 
can see that it uses too many values in the same way that the moving 
average of a moving average does.
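The window-convolution point is easy to check numerically: push a unit impulse through two cascaded N-point moving averages and the composite weights come out as a triangle of width 2N-1 (a plain Python sketch, not mapping-library code):

```python
def moving_average(xs, n):
    """n-point moving average; the window is simply shorter at the start."""
    return [sum(xs[max(0, i - n + 1): i + 1]) / n for i in range(len(xs))]

n = 4
impulse = [1.0] + [0.0] * 10
twice = moving_average(moving_average(impulse, n), n)

# Un-normalize to read off the effective window weights:
weights = [round(w * n * n) for w in twice]
# -> [1, 2, 3, 4, 3, 2, 1, 0, ...]: a triangular window of width 2n-1 = 7
```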




Re: [PD] variance from mapping library

2009-03-23 Thread Mathieu Bouchard

On Mon, 23 Mar 2009, cyrille henry wrote:


The mapping lib is optimized to work with numbers from -1 to 1.
Do you still have errors with numbers in that range?


Well, yes, of course: because float precision is relative, this behaviour 
happens at any scale of number, except in case of underflow, and in that 
case the result can't be accurate anymore anyway.


You can send simple repeated sequences to [mapping/variance] to show that 
not only can it drift into negative values almost endlessly, but it 
doesn't even compute the variance of N values. This is because there is a 
moving average of something involving another moving average. This makes a 
global moving average of 2N-1 values instead of N, and then the window 
isn't rectangular anymore, it's triangular, because different values are 
counted a different number of times depending on how old they are.


That's why I added extra zeroes in the messagebox. Remove them 
one at a time and you'll see this second phenomenon in addition 
to the first one.


#N canvas 425 84 450 300 10;
#X obj 83 33 tgl 15 0 empty empty empty 17 7 0 10 -24198 -1 -1 1 1;
#X obj 83 69 t b b;
#X obj 83 107 f;
#X obj 83 126 nbx 12 14 -1e+37 1e+37 0 0 empty empty empty 0 -8 0 10
-225271 -1 -1 -0.000188258 256;
#X obj 83 50 metro 1;
#X msg 113 88 1 \, 1 \, 0 \, 0 \, 0 \, 0 \, 0 \, 0 \, 0 \, 0 \, 0;
#X obj 114 107 variance 5;
#X connect 0 0 4 0;
#X connect 1 0 2 0;
#X connect 1 1 5 0;
#X connect 2 0 3 0;
#X connect 4 0 1 0;
#X connect 5 0 6 0;
#X connect 6 0 2 1;



Re: [PD] variance from mapping library

2009-03-23 Thread cyrille henry

The mapping lib is optimized to work with numbers from -1 to 1.
Do you still have errors with numbers in that range?
Cyrille


Mathieu Bouchard wrote:

On Thu, 19 Mar 2009, Oded Ben-Tal wrote:

abs(a)+abs(b), or something like that. But 100000*100000 = 
10000000000, and if you divide that by 2^24 = 16777216, you get about 
596, which is an upper bound for the amount of error: so, the error 
is surely between -596 and +596.
I trust your math here but just notice that your example converges to 
-691.


That's because 100000*100000 is only one value. Then, the second 
[mean_n] has to process 100000*100000 + 90000*90000 + 80000*80000 + 70000*70000 + ... 
so the theoretical error maximum is much more than 596 but much less 
than 596*10. In practice, many of the individual errors are not that big 
and perhaps some of them cancel each other.


But to find the reason why it is -691 precisely would take a long time 
and would not be very insightful.


But if I understand you correctly 'filtering' the input data through 
[int] should make variance error free (we hope).


No, it won't, because all of the other objects still process floats. The 
reason ints wouldn't have that problem is that they have fixed 
precision: the step between two adjacent numbers is 1, whereas 
for floats it is roughly proportional to the numbers themselves. With 
integers you will hit an overflow problem quite quickly; for 
example, if you remake that abstraction using 32-bit integers (for 
example, using the GridFlow library), then you can already get an 
overflow by using random 5-digit numbers. But at least it goes back to 
normal when given a more modest sequence, whereas with floats it gets 
stuck and needs to be reset (recreated).


Using int64 you could get perfect results for most uses, but I don't 
recall whether the bugs in GridFlow's int64 support were fixed or not... 
I never quite had a use for int64 in the end.


For the "mapping" library, there isn't much of a choice but to remake it 
with a slower algorithm, unless someone knows a magic trick for 
cancelling almost all of the error while not running so slow. Actually, 
it already runs in linear time, so it wouldn't be such a big loss if the 
complete sum was recomputed at every step, because it would still be 
linear. It would be a big loss if it could run in constant time (e.g. 
using an array for the queue) and had to be switched to linear time.




Re: [PD] variance from mapping library

2009-03-19 Thread Mathieu Bouchard

On Thu, 19 Mar 2009, Oded Ben-Tal wrote:

abs(a)+abs(b), or something like that. But 100000*100000 = 10000000000, and 
if you divide that by 2^24 = 16777216, you get about 596, which is an upper 
bound for the amount of error: so, the error is surely between -596 and 
+596.

I trust your math here but just notice that your example converges to -691.


That's because 100000*100000 is only one value. Then, the second [mean_n] 
has to process 100000*100000 + 90000*90000 + 80000*80000 + 70000*70000 + ... so the 
theoretical error maximum is much more than 596 but much less than 596*10. 
In practice, many of the individual errors are not that big and perhaps 
some of them cancel each other.


But to find the reason why it is -691 precisely would take a long time and 
would not be very insightful.


But if I understand you correctly 'filtering' the input data through [int] 
should make variance error free (we hope).


No, it won't, because all of the other objects still process floats. The 
reason ints wouldn't have that problem is that they have fixed 
precision: the step between two adjacent numbers is 1, whereas 
for floats it is roughly proportional to the numbers themselves. With 
integers you will hit an overflow problem quite quickly; for 
example, if you remake that abstraction using 32-bit integers (for 
example, using the GridFlow library), then you can already get an overflow 
by using random 5-digit numbers. But at least it goes back to normal when 
given a more modest sequence, whereas with floats it gets stuck and needs 
to be reset (recreated).


Using int64 you could get perfect results for most uses, but I don't 
recall whether the bugs in GridFlow's int64 support were fixed or not... I 
never quite had a use for int64 in the end.


For the "mapping" library, there isn't much of a choice but to remake it 
with a slower algorithm, unless someone knows a magic trick for cancelling 
almost all of the error while not running so slow. Actually, it already 
runs in linear time, so it wouldn't be such a big loss if the complete sum 
was recomputed at every step, because it would still be linear. It would 
be a big loss if it could run in constant time (e.g. using an array for 
the queue) and had to be switched to linear time.
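The trade-off can be sketched in Python, simulating Pd's single-precision floats; the function names are mine and this illustrates the two strategies, it is not mapping-library code:

```python
import struct
from collections import deque

def f32(x):
    """Round a value to IEEE single precision, as Pd stores floats."""
    return struct.unpack('f', struct.pack('f', x))[0]

def running_mean(stream, n):
    """Constant-time shortcut: add the new value, subtract the oldest.
    Each rounded operation leaves a residue in the running total."""
    q, total, out = deque(), 0.0, []
    for x in stream:
        x = f32(x)
        q.append(x)
        total = f32(total + x)
        if len(q) > n:
            total = f32(total - q.popleft())
        out.append(f32(total / len(q)))
    return out

def recomputed_mean(stream, n):
    """Linear-time fix: re-sum the whole window at every step,
    so old rounding errors cannot accumulate in a running total."""
    q, out = deque(maxlen=n), []
    for x in stream:
        q.append(f32(x))
        total = 0.0
        for v in q:
            total = f32(total + v)
        out.append(f32(total / len(q)))
    return out

# A burst of large values, then a small constant stream:
stream = [100000.1] * 10 + [0.1] * 40
stuck = running_mean(stream, 10)[-1]      # stays visibly off 0.1
clean = recomputed_mean(stream, 10)[-1]   # ~0.1 as expected
```

Only the running total keeps the residue of the large values after they have left the window, which is exactly the "gets stuck and needs to be reset" behaviour described above.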




Re: [PD] variance from mapping library

2009-03-19 Thread Oded Ben-Tal

and then changing 100000 to 0 makes the result converge to -691.2.

actually, afaik, [variance] doesn't have a bug by itself. The bug is in 
[mean_n], which displays similar behaviour in default mode.


Yes that was my thought as well because the variance abstraction looks 
correct.




The bug is because of algebraic assumptions that don't work with floats. With 
real numbers, a+b-a-b = 0, but with floats, a+b-a-b is only guaranteed to be 
a "small" number, that is, less than 2^24 times smaller than abs(a)+abs(b), 
or something like that. But 100000*100000 = 10000000000, and if you divide 
that by 2^24 = 16777216, you get about 596, which is an upper bound for the 
amount of error: so, the error is surely between -596 and +596.




I trust your math here but just notice that your example converges to 
-691. But if I understand you correctly 'filtering' the input data through 
[int] should make variance error free (we hope).


thanks
Oded



Re: [PD] variance from mapping library

2009-03-19 Thread Mathieu Bouchard

On Thu, 19 Mar 2009, Oded Ben-Tal wrote:


I am getting a negative number occasionally.


OK, I isolated an example:

[metro 100]
 |
[100000]
 |
[variance 10]
 |
converges to -281.6

and then changing 100000 to 0 makes the result converge to -691.2.

actually, afaik, [variance] doesn't have a bug by itself. The bug is in 
[mean_n], which displays similar behaviour in default mode.


The bug is because of algebraic assumptions that don't work with floats. 
With real numbers, a+b-a-b = 0, but with floats, a+b-a-b is only 
guaranteed to be a "small" number, that is, less than 2^24 times smaller 
than abs(a)+abs(b), or something like that. But 100000*100000 = 
10000000000, and if you divide that by 2^24 = 16777216, you get about 596, 
which is an upper bound for the amount of error: so, the error is surely 
between -596 and +596.


Then [mean_n] boosts this error to the max by adding it all together. 
Statistically, that error could diverge.


So the shortcut of keeping a running total of a list of length N, only 
adding the new element to it and subtracting the oldest element from it, 
is not something that works with all floats. It's a trick that works with 
an int type, but with floats it only works sometimes.
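Both claims can be checked in a few lines of Python, rounding every step to single precision the way Pd does (a sketch; the variable names are mine):

```python
import struct

def f32(x):
    """Round to IEEE single precision, Pd's float type."""
    return struct.unpack('f', struct.pack('f', x))[0]

# With real numbers a+b-a-b = 0; in single precision it is merely "small":
a, b = f32(100000.0), f32(0.1)
err = f32(f32(f32(a + b) - a) - b)
assert err != 0.0                              # not zero...
assert abs(err) < (abs(a) + abs(b)) / 2 ** 24  # ...but within the stated bound

# With an int type the same shuffle is exact, so the moving-sum trick is safe:
ai, bi = 100000, 1
assert ai + bi - ai - bi == 0
```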




Re: [PD] variance from mapping library

2009-03-19 Thread Oded Ben-Tal


Is it really a negative number, or is it more like 9.74915e-05? In the 
latter case, only the exponent is negative, which means 
9.74915/10/10/10/10/10 = 0.0000974915, whereas a plus sign, like in 1.2e+06, 
would mean 1.2*10*10*10*10*10*10 = 1200000.




I am getting a negative number occasionally.
Oded



Re: [PD] variance from mapping library

2009-03-18 Thread Mathieu Bouchard

On Wed, 18 Mar 2009, Oded Ben-Tal wrote:

when using variance (from mapping library) - I get negative numbers. 
Since variance is supposed to be always positive and I don't see 
anything wrong in the way the abstraction is done, I presume there is 
something I'm missing. My impression is that small changes in the input 
translate to negative 'variance'


Is it really a negative number, or is it more like 9.74915e-05? In the 
latter case, only the exponent is negative, which means 
9.74915/10/10/10/10/10 = 0.0000974915, whereas a plus sign, like in 
1.2e+06, would mean 1.2*10*10*10*10*10*10 = 1200000.
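In Python terms (just to illustrate the notation, nothing Pd-specific):

```python
# e-notation: the sign after 'e' belongs to the exponent, not the number
x = float("9.74915e-05")
assert x == 0.0000974915   # a small positive number
assert x > 0

y = float("1.2e+06")
assert y == 1200000.0      # a large positive number
```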




[PD] variance from mapping library

2009-03-18 Thread Oded Ben-Tal

I'm not sure if it's a bug but:
when using variance (from mapping library) - I get negative numbers. Since 
variance is supposed to be always positive and I don't see anything wrong in the way 
the abstraction is done, I presume there is something I'm missing.
My impression is that small changes in the input translate to negative 
'variance'




___
Oded Ben-Tal
http://ccrma.stanford.edu/~oded
o...@ccrma.stanford.edu
