On Wed, Aug 6, 2008 at 7:14 AM, Matthew Toseland
<toad at amphibian.dyndns.org> wrote:
> On Tuesday 05 August 2008 23:11, Ed Tomlinson wrote:
>> On August 5, 2008, Matthew Toseland wrote:
>> > On Sunday 03 August 2008 00:58, you wrote:
>> > > On August 2, 2008, Matthew Toseland wrote:
>> > > > On Saturday 02 August 2008 02:41, Ed Tomlinson wrote:
>> > > > > On August 1, 2008, Michael Rogers wrote:
>> > > > > > Daniel Cheng wrote:
>> > > > > > > in a circular space, we can get infinite number of "average" by
>> > changing
>> > > > > > > point of reference. --- choose the point of reference with the
>> > smallest
>> > > > > > > standard deviation.
>> > > > > >
>> > > > > > From an infinite number of alternatives? Sounds like it might take
> a
>> > > > > > while. ;-) I think we can get away with just trying each location
> as
>> > the
>> > > > > > reference point, but that's still O(n^2) running time.
>> > > > > >
>> > > > > > How about this: the average of two locations is the location
> midway
>> > > > > > along the shortest line between them. So to estimate the average
> of a
>> > > > > > set of locations, choose two locations at random from the set and
>> > > > > > replace them with their average, and repeat until there's only one
>> > > > > > location in the set.
>> > > > > >
>> > > > > > It's alchemy but at least it runs in linear time. :-)
>> > > > >
>> > > > > Another idea that should run in linear time.  Consider each key a
> point
>> > on
>> > > > the edge
>> > > > > of a circle (with a radius of 1).  Convert each key (0=0 degress,
> 1=360)
>> > to
>> > > > an x, y cord and
>> > > > > average these numbers.  Once all keys are averaged convert the (x,
> y)
>> > back
>> > > > into a key to
>> > > > > get the average.
>> > > > >
>> > > > > eg    x = sin (key * 360), y = cos(key * 360)  assuming the angle is
> in
>> > > > degrees not radians.
>> > > > >       where a key is a number between 0 and 1
>> > > >
>> > > > You miss the point. We already have what is effectively an angle, it's
>> > just in
>> > > > 0 to 1 instead of 0 to 360 deg / 2*pi rads. The problem is the
> circular
>> > > > keyspace.
>> > >
>> > > No I have _not_ missed the point.  If you map each key onto the rim of a
>> > circle and
>> > > average resulting the x and y coords of all the keys you get an average
> in a
>> > circular
>> > > keyspace.  Try it.
>> > >
>> > > If fact radius of the averaged x, y will also be a measure of just how
>> > specialized your
>> > > store is... (eg r = sqrt(average(x cords)^2+average(y cords)^2)
>> >
>> > Cool! So the principle is that the closer the mid-point is to being
> actually
>> > on the circle, the more specialised the store?
>>
>> Yes.  'r' should be between 0 and 1.  The closer to 0 the less specialized
> the store is.
>> It is not a perfect measure of specialization though.  If the store is
> specialized in
>> two areas 180 degrees apart or 3 areas 120 degrees apart and so on, r could
> be small
>> with the store fairly specialized...
>>
>> > Somebody should implement this... We don't need to keep the actual
> samples, we
>> > can just keep a running average of x and y, right? What about sensitivity,
>> > can we reuse the bootstrapping-decaying-running-average code? Aka klein
>> > filter with sensitivity reducing over time so that for the first N samples
>> > it's effectively a simple running average and after that it's a klein
> filter
>> > with sensitivity equal to that at the end of the first phase? Or would we
>> > need to use a running average and reset it periodically?
>>
>> you just need to keep the totals for x and y along with the number of keys
> (call this n) .  When a
>> key is removed from the store subtract its x, y coord and reduce n by 1.
> reverse this when adding
>> a key...
>
> That could work for the datastore yes, assuming it's persistent; meaning it
> will gradually become more inaccurate with unclean shutdowns. and we can't
> just wipe it either because then we'd have to reconstruct it from the store
> before adding more; maybe it could be integrated with bloom filter
> reconstruction for sdiz's store?

The original routing key will not be stored in the new store, only a
SHA256-digested version will be stored.

Do we need to store the location for each key? (so that we can update
the average location when the key is removed)
Maybe a running average is just good enough?
Or just store an approximated location, not the actual value?

> For requests we'd need something more like a BDRA anyway.
>>
>> Ed
>
> _______________________________________________
> Devl mailing list
> Devl at freenetproject.org
> http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
>

Reply via email to