Re: [freenet-dev] Routing question

2015-10-24 Thread Matthew Toseland
On 19/10/15 18:33, Bert Massop wrote:
> On 18-10-15 21:08, Matthew Toseland wrote:
>> On 18/10/15 14:33, Florent Daigniere wrote:
>>> On Tue, 2015-10-13 at 22:26 +0200, Bert Massop wrote:
 On 10-10-15 21:01, nextg...@freenetproject.org wrote:
> Oi!
>
> Given https://github.com/freenet/fred/blob/next/src/freenet/node/Pe
> erManager.java#L1049
>
> We seem to get our peers location and their peer's location...
> Iterate over the whole lot (150^2 entries max)
> ...
> And then for each of them, ensure that they haven't been routed to
> before by iterating over a list of locations we've already routed
> to (18 is the max HTL)
> ...
> That's 18*150^2 => 405k iterations in the worst case ... on the
> hot-path (each request we route will go through it)!
 Nice find!
 One more thing: you forgot the recentlyFailed case, where it recurses
 twice (line 1173 and 1183), leading to a total of 1.2M iterations
 worst
 case (however this case is extremely unlikely to occur).

> It seems very naive.
>
> How can we do better? Locations are double, what's the best data-
> structure here?
> I'm rusty: is a TreeMap with a custom comparator (NavigableMap)
> what we want?
 At least keeping FOAF peers locations sorted would allow for binary
 search, i.e. O(log n) instead of O(n) search, at the cost of an
 initial
 O(n log n) sort. That would however require reworking
 PeerNode.getPeersLocation() to return (and maintain) a sorted array.

 If one would then also keep the routedTo (of size m) sorted by
 location,
 we can find the FOAF (assuming peer count n for ourself and all of
 our
 peers) closest to the target location not in routedTo in O(log n + m)
 time for each of our peers, and when done intelligently, find our
 peer
 with the FOAF closest to the target O(n log n + nm + m log m) for
 *all*
 of our peers. That would bring the iteration count down around 100-
 fold
 from the current situation.

 For any further optimization, the hardest part would be to figure out
 all the corner cases present in the current spaghetti. ☺

>>> That'd work nicely if we weren't doing the operations modulo
>>> Double.MIN_VALUE. Is there any reason why our locations doubles and not
>>> BigDecimals?
>>>
>>> http://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html
> Our locations should in the end use the full 256 bit locations or an
> exact prefix of that, immediately implying a circular keyspace (add one
> to the max, end up in 0 again). There is no need to use BigDecimals
> here, locations don't get arbitrarily large and we don't need to do
> arithmetic except for change / distance calculation.
>
> Apart from that, what's the idea behind the Double.MIN_VALUE after all?
> If that's intended to capture minuscule calculation errors, it won't
> work: Double.MIN_VALUE is somewhere around 1e-150, while the distance
> between two consecutive double values in the range [0.5, 1.0) is
> somewhere around 1e-16 (or equal to Math.ulp(0.5)), i.e. Math.abs(x - y)
> where x and y are in the range of [0.5, 1.0) will never return a value
> as small as Double.MIN_VALUE. Also, Math.abs is guaranteed never to
> return negative zero, so checking for that would be pointless too.
It's a standard Java idiom. And yes, it's wrong. Because Oracle doesn't
provide what is actually needed - 2^-mantissasize or something...?

However I agree we should just use long's for locations...
>>> Toad?
>> Originally speed. Of course we can look up nodes by a Double - but we
>> need to look for >= and <=, so it's more complicated, and I don't recall
>> whether the API allows it cheaply (you used to have to create a bunch of
>> objects e.g. iterators etc to get next-key-after-target). I agree that
>> being able to do an exact match might be useful for some optimisations.
>>> Any objection to that being changed?
>> How about using a long? More bits than a double, at least for the vast
>> majority of the space. And we can map 0.5 to 0, so sign isn't a big deal.
>>
>> Note that we use locations in both routing and swapping, so it might
>> affect more code.
> Why not map the range [0.0, 1.0) to [0L, 0xL]? Then we'd
> just use a 64 bit prefix of the 256 bit location. Just ignore the
> signedness of the longs, we don't do arbitrary arithmetic anyhow.
Except in swapping, where we may need to convert back to doubles.
> There still is my location-as-object branch[0] that rewrites the entire
> location logic into an object-oriented approach where locations are
> immutable objects. The branch is somewhat outdated, but you get the
> idea. Once that's in, it would be trivial to switch implementations to
> actually use any internal representation.
>
> [0] https://github.com/freenet/fred/compare/next...bertm:location-as-object
That sounds like a good idea. Afraid I can't review it right now

Re: [freenet-dev] Routing question

2015-10-24 Thread Matthew Toseland
On 19/10/15 15:07, Florent Daigniere wrote:
> On Mon, 2015-10-19 at 14:11 +0100, Matthew Toseland wrote:
>> I still think you should separate optimising routing (trees of all
>> locations etc, long's instead of doubles) from changing behaviour
>> (ignoring FOAF peers if they were also the peers of a node we've
>> already
>> visited).
> That's what I'm trying to do here... I haven't done the last level of
> optimization where we combine all of them into a single structure
> (instead of iterating on each peer). It's still O(n)+
>
> Here's how it's wired-in
> https://github.com/nextgens/fred/blob/optimize-closerpeer/src/freenet/n
> ode/PeerManager.java#L1054
>
> As you can see, the logic is preserved, with the exception of how we
> handle what's in the exclude list... What I'm wondering is whether that
> new behaviour is fine or not. Do we need to compare what's in the
> exclude list modulo Double.MIN_VALUE (old  behaviour) or is an exact
> match fine (new behaviour)?
Double.MIN_VALUE is only really needed when we're doing arithmetic on
doubles to get new locations. I don't think we are here. It should be fine?

Granted we do pass them around - in messages as doubles, in node
references as text, so maybe
> If it's not I just need more code to make it mirror the old one.
>>  Does our FOAF locations-of-peers announcement include nodes
>> which are backed off for long periods? Nodes which are disconnected?
> It's complicated. There are several checks before and after; I'm not
> sure what the behaviour should be.
The new code needs to mirror that - that's a bigger deal IIRC.



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Routing question

2015-10-19 Thread Bert Massop
On 18-10-15 21:08, Matthew Toseland wrote:
> On 18/10/15 14:33, Florent Daigniere wrote:
>> On Tue, 2015-10-13 at 22:26 +0200, Bert Massop wrote:
>>> On 10-10-15 21:01, nextg...@freenetproject.org wrote:
 Oi!

 Given https://github.com/freenet/fred/blob/next/src/freenet/node/Pe
 erManager.java#L1049

 We seem to get our peers location and their peer's location...
 Iterate over the whole lot (150^2 entries max)
 ...
 And then for each of them, ensure that they haven't been routed to
 before by iterating over a list of locations we've already routed
 to (18 is the max HTL)
 ...
 That's 18*150^2 => 405k iterations in the worst case ... on the
 hot-path (each request we route will go through it)!
>>> Nice find!
>>> One more thing: you forgot the recentlyFailed case, where it recurses
>>> twice (line 1173 and 1183), leading to a total of 1.2M iterations
>>> worst
>>> case (however this case is extremely unlikely to occur).
>>>
 It seems very naive.

 How can we do better? Locations are double, what's the best data-
 structure here?
 I'm rusty: is a TreeMap with a custom comparator (NavigableMap)
 what we want?
>>> At least keeping FOAF peers locations sorted would allow for binary
>>> search, i.e. O(log n) instead of O(n) search, at the cost of an
>>> initial
>>> O(n log n) sort. That would however require reworking
>>> PeerNode.getPeersLocation() to return (and maintain) a sorted array.
>>>
>>> If one would then also keep the routedTo (of size m) sorted by
>>> location,
>>> we can find the FOAF (assuming peer count n for ourself and all of
>>> our
>>> peers) closest to the target location not in routedTo in O(log n + m)
>>> time for each of our peers, and when done intelligently, find our
>>> peer
>>> with the FOAF closest to the target O(n log n + nm + m log m) for
>>> *all*
>>> of our peers. That would bring the iteration count down around 100-
>>> fold
>>> from the current situation.
>>>
>>> For any further optimization, the hardest part would be to figure out
>>> all the corner cases present in the current spaghetti. ☺
>>>
>> That'd work nicely if we weren't doing the operations modulo
>> Double.MIN_VALUE. Is there any reason why our locations doubles and not
>> BigDecimals?
>>
>> http://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html

Our locations should in the end use the full 256 bit locations or an
exact prefix of that, immediately implying a circular keyspace (add one
to the max, end up in 0 again). There is no need to use BigDecimals
here, locations don't get arbitrarily large and we don't need to do
arithmetic except for change / distance calculation.

Apart from that, what's the idea behind the Double.MIN_VALUE after all?
If that's intended to capture minuscule calculation errors, it won't
work: Double.MIN_VALUE is somewhere around 1e-150, while the distance
between two consecutive double values in the range [0.5, 1.0) is
somewhere around 1e-16 (or equal to Math.ulp(0.5)), i.e. Math.abs(x - y)
where x and y are in the range of [0.5, 1.0) will never return a value
as small as Double.MIN_VALUE. Also, Math.abs is guaranteed never to
return negative zero, so checking for that would be pointless too.

>>
>> Toad?
> Originally speed. Of course we can look up nodes by a Double - but we
> need to look for >= and <=, so it's more complicated, and I don't recall
> whether the API allows it cheaply (you used to have to create a bunch of
> objects e.g. iterators etc to get next-key-after-target). I agree that
> being able to do an exact match might be useful for some optimisations.
>> Any objection to that being changed?
> How about using a long? More bits than a double, at least for the vast
> majority of the space. And we can map 0.5 to 0, so sign isn't a big deal.
> 
> Note that we use locations in both routing and swapping, so it might
> affect more code.

Why not map the range [0.0, 1.0) to [0L, 0xL]? Then we'd
just use a 64 bit prefix of the 256 bit location. Just ignore the
signedness of the longs, we don't do arbitrary arithmetic anyhow.

There still is my location-as-object branch[0] that rewrites the entire
location logic into an object-oriented approach where locations are
immutable objects. The branch is somewhat outdated, but you get the
idea. Once that's in, it would be trivial to switch implementations to
actually use any internal representation.

[0] https://github.com/freenet/fred/compare/next...bertm:location-as-object

— Bert



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Routing question

2015-10-19 Thread Florent Daigniere
On Mon, 2015-10-19 at 14:11 +0100, Matthew Toseland wrote:
> On 19/10/15 10:37, Florent Daigniere wrote:
> > On Sun, 2015-10-18 at 20:08 +0100, Matthew Toseland wrote:
> > > On 18/10/15 14:33, Florent Daigniere wrote:
> > > > On Tue, 2015-10-13 at 22:26 +0200, Bert Massop wrote:
> > > > > On 10-10-15 21:01, nextg...@freenetproject.org wrote:
> > > > > > Oi!
> > > > > > 
> > > > > > Given https://github.com/freenet/fred/blob/next/src/freenet
> > > > > > /nod
> > > > > > e/Pe
> > > > > > erManager.java#L1049
> > > > > > 
> > > > > > We seem to get our peers location and their peer's
> > > > > > location...
> > > > > > Iterate over the whole lot (150^2 entries max)
> > > > > > ...
> > > > > > And then for each of them, ensure that they haven't been
> > > > > > routed
> > > > > > to
> > > > > > before by iterating over a list of locations we've already
> > > > > > routed
> > > > > > to (18 is the max HTL)
> > > > > > ...
> > > > > > That's 18*150^2 => 405k iterations in the worst case ... on
> > > > > > the
> > > > > > hot-path (each request we route will go through it)!
> > > > > Nice find!
> > > > > One more thing: you forgot the recentlyFailed case, where it
> > > > > recurses
> > > > > twice (line 1173 and 1183), leading to a total of 1.2M
> > > > > iterations
> > > > > worst
> > > > > case (however this case is extremely unlikely to occur).
> > > > > 
> > > > > > It seems very naive.
> > > > > > 
> > > > > > How can we do better? Locations are double, what's the best
> > > > > > data-
> > > > > > structure here?
> > > > > > I'm rusty: is a TreeMap with a custom comparator
> > > > > > (NavigableMap)
> > > > > > what we want?
> > > > > At least keeping FOAF peers locations sorted would allow for
> > > > > binary
> > > > > search, i.e. O(log n) instead of O(n) search, at the cost of
> > > > > an
> > > > > initial
> > > > > O(n log n) sort. That would however require reworking
> > > > > PeerNode.getPeersLocation() to return (and maintain) a sorted
> > > > > array.
> > > > > 
> > > > > If one would then also keep the routedTo (of size m) sorted
> > > > > by
> > > > > location,
> > > > > we can find the FOAF (assuming peer count n for ourself and
> > > > > all
> > > > > of
> > > > > our
> > > > > peers) closest to the target location not in routedTo in
> > > > > O(log n
> > > > > + m)
> > > > > time for each of our peers, and when done intelligently, find
> > > > > our
> > > > > peer
> > > > > with the FOAF closest to the target O(n log n + nm + m log m)
> > > > > for
> > > > > *all*
> > > > > of our peers. That would bring the iteration count down
> > > > > around
> > > > > 100-
> > > > > fold
> > > > > from the current situation.
> > > > > 
> > > > > For any further optimization, the hardest part would be to
> > > > > figure
> > > > > out
> > > > > all the corner cases present in the current spaghetti. ☺
> > > > > 
> > > > That'd work nicely if we weren't doing the operations modulo
> > > > Double.MIN_VALUE. Is there any reason why our locations doubles
> > > > and
> > > > not
> > > > BigDecimals?
> > > > 
> > > > http://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.h
> > > > tml
> > > > 
> > > > Toad?
> > > Originally speed. Of course we can look up nodes by a Double -
> > > but we
> > > need to look for >= and <=, so it's more complicated, and I don't
> > > recall
> > > whether the API allows it cheaply (you used to have to create a
> > > bunch
> > > of
> > > objects e.g. iterators etc to get next-key-after-target). I agree
> > > that
> > > being able to do an exact match might be useful for some
> > > optimisations.
> > > > Any objection to that being changed?
> > > How about using a long? More bits than a double, at least for the
> > > vast
> > > majority of the space. And we can map 0.5 to 0, so sign isn't a
> > > big
> > > deal.
> > > 
> > > Note that we use locations in both routing and swapping, so it
> > > might
> > > affect more code.
> > I've written some code overnight: 
> > https://github.com/nextgens/fred/blob/optimize-closerpeer/test/free
> > net/
> > node/LocationTest.java
> > https://github.com/nextgens/fred/blob/optimize-closerpeer/src/freen
> > et/n
> > ode/Location.java
> > 
> > It sort-of-works; the exclusion list is matched exactly as opposed
> > to
> > modulo Double.MIN_VALUE... is that good enough?
> And the exclusion list is a set of peers' peers which we assume we've
> already routed to?
> 

Yeah; that and both our current location and the previous one.

> I still think you should separate optimising routing (trees of all
> locations etc, long's instead of doubles) from changing behaviour
> (ignoring FOAF peers if they were also the peers of a node we've
> already
> visited).

That's what I'm trying to do here... I haven't done the last level of
optimization where we combine all of them into a single structure
(instead of iterating on each peer). It's still O(n)+

Here's how it's wired-in
https://github.com/nextgens/fred/blob/optimize-closerpeer/src/freenet/n
ode/PeerManager

Re: [freenet-dev] Routing question

2015-10-19 Thread Matthew Toseland
On 19/10/15 10:37, Florent Daigniere wrote:
> On Sun, 2015-10-18 at 20:08 +0100, Matthew Toseland wrote:
>> On 18/10/15 14:33, Florent Daigniere wrote:
>>> On Tue, 2015-10-13 at 22:26 +0200, Bert Massop wrote:
 On 10-10-15 21:01, nextg...@freenetproject.org wrote:
> Oi!
>
> Given https://github.com/freenet/fred/blob/next/src/freenet/nod
> e/Pe
> erManager.java#L1049
>
> We seem to get our peers location and their peer's location...
> Iterate over the whole lot (150^2 entries max)
> ...
> And then for each of them, ensure that they haven't been routed
> to
> before by iterating over a list of locations we've already
> routed
> to (18 is the max HTL)
> ...
> That's 18*150^2 => 405k iterations in the worst case ... on the
> hot-path (each request we route will go through it)!
 Nice find!
 One more thing: you forgot the recentlyFailed case, where it
 recurses
 twice (line 1173 and 1183), leading to a total of 1.2M iterations
 worst
 case (however this case is extremely unlikely to occur).

> It seems very naive.
>
> How can we do better? Locations are double, what's the best
> data-
> structure here?
> I'm rusty: is a TreeMap with a custom comparator (NavigableMap)
> what we want?
 At least keeping FOAF peers locations sorted would allow for
 binary
 search, i.e. O(log n) instead of O(n) search, at the cost of an
 initial
 O(n log n) sort. That would however require reworking
 PeerNode.getPeersLocation() to return (and maintain) a sorted
 array.

 If one would then also keep the routedTo (of size m) sorted by
 location,
 we can find the FOAF (assuming peer count n for ourself and all
 of
 our
 peers) closest to the target location not in routedTo in O(log n
 + m)
 time for each of our peers, and when done intelligently, find our
 peer
 with the FOAF closest to the target O(n log n + nm + m log m) for
 *all*
 of our peers. That would bring the iteration count down around
 100-
 fold
 from the current situation.

 For any further optimization, the hardest part would be to figure
 out
 all the corner cases present in the current spaghetti. ☺

>>> That'd work nicely if we weren't doing the operations modulo
>>> Double.MIN_VALUE. Is there any reason why our locations doubles and
>>> not
>>> BigDecimals?
>>>
>>> http://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html
>>>
>>> Toad?
>> Originally speed. Of course we can look up nodes by a Double - but we
>> need to look for >= and <=, so it's more complicated, and I don't
>> recall
>> whether the API allows it cheaply (you used to have to create a bunch
>> of
>> objects e.g. iterators etc to get next-key-after-target). I agree
>> that
>> being able to do an exact match might be useful for some
>> optimisations.
>>> Any objection to that being changed?
>> How about using a long? More bits than a double, at least for the
>> vast
>> majority of the space. And we can map 0.5 to 0, so sign isn't a big
>> deal.
>>
>> Note that we use locations in both routing and swapping, so it might
>> affect more code.
> I've written some code overnight: 
> https://github.com/nextgens/fred/blob/optimize-closerpeer/test/freenet/
> node/LocationTest.java
> https://github.com/nextgens/fred/blob/optimize-closerpeer/src/freenet/n
> ode/Location.java
>
> It sort-of-works; the exclusion list is matched exactly as opposed to
> modulo Double.MIN_VALUE... is that good enough?
And the exclusion list is a set of peers' peers which we assume we've
already routed to?

I still think you should separate optimising routing (trees of all
locations etc, long's instead of doubles) from changing behaviour
(ignoring FOAF peers if they were also the peers of a node we've already
visited). Does our FOAF locations-of-peers announcement include nodes
which are backed off for long periods? Nodes which are disconnected? I
guess there's no real security issue at least - FOAF can already capture
traffic easily, and solutions to this rely on either arbitrary
proportional thresholds or clustering - therefore probably not
applicable for long links/local traffic.
> Florent



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Routing question

2015-10-19 Thread Florent Daigniere
On Sun, 2015-10-18 at 20:08 +0100, Matthew Toseland wrote:
> On 18/10/15 14:33, Florent Daigniere wrote:
> > On Tue, 2015-10-13 at 22:26 +0200, Bert Massop wrote:
> > > On 10-10-15 21:01, nextg...@freenetproject.org wrote:
> > > > Oi!
> > > > 
> > > > Given https://github.com/freenet/fred/blob/next/src/freenet/nod
> > > > e/Pe
> > > > erManager.java#L1049
> > > > 
> > > > We seem to get our peers location and their peer's location...
> > > > Iterate over the whole lot (150^2 entries max)
> > > > ...
> > > > And then for each of them, ensure that they haven't been routed
> > > > to
> > > > before by iterating over a list of locations we've already
> > > > routed
> > > > to (18 is the max HTL)
> > > > ...
> > > > That's 18*150^2 => 405k iterations in the worst case ... on the
> > > > hot-path (each request we route will go through it)!
> > > Nice find!
> > > One more thing: you forgot the recentlyFailed case, where it
> > > recurses
> > > twice (line 1173 and 1183), leading to a total of 1.2M iterations
> > > worst
> > > case (however this case is extremely unlikely to occur).
> > > 
> > > > It seems very naive.
> > > > 
> > > > How can we do better? Locations are double, what's the best
> > > > data-
> > > > structure here?
> > > > I'm rusty: is a TreeMap with a custom comparator (NavigableMap)
> > > > what we want?
> > > At least keeping FOAF peers locations sorted would allow for
> > > binary
> > > search, i.e. O(log n) instead of O(n) search, at the cost of an
> > > initial
> > > O(n log n) sort. That would however require reworking
> > > PeerNode.getPeersLocation() to return (and maintain) a sorted
> > > array.
> > > 
> > > If one would then also keep the routedTo (of size m) sorted by
> > > location,
> > > we can find the FOAF (assuming peer count n for ourself and all
> > > of
> > > our
> > > peers) closest to the target location not in routedTo in O(log n
> > > + m)
> > > time for each of our peers, and when done intelligently, find our
> > > peer
> > > with the FOAF closest to the target O(n log n + nm + m log m) for
> > > *all*
> > > of our peers. That would bring the iteration count down around
> > > 100-
> > > fold
> > > from the current situation.
> > > 
> > > For any further optimization, the hardest part would be to figure
> > > out
> > > all the corner cases present in the current spaghetti. ☺
> > > 
> > That'd work nicely if we weren't doing the operations modulo
> > Double.MIN_VALUE. Is there any reason why our locations doubles and
> > not
> > BigDecimals?
> > 
> > http://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html
> > 
> > Toad?
> Originally speed. Of course we can look up nodes by a Double - but we
> need to look for >= and <=, so it's more complicated, and I don't
> recall
> whether the API allows it cheaply (you used to have to create a bunch
> of
> objects e.g. iterators etc to get next-key-after-target). I agree
> that
> being able to do an exact match might be useful for some
> optimisations.
> > Any objection to that being changed?
> How about using a long? More bits than a double, at least for the
> vast
> majority of the space. And we can map 0.5 to 0, so sign isn't a big
> deal.
> 
> Note that we use locations in both routing and swapping, so it might
> affect more code.
> > 

I've written some code overnight: 
https://github.com/nextgens/fred/blob/optimize-closerpeer/test/freenet/
node/LocationTest.java
https://github.com/nextgens/fred/blob/optimize-closerpeer/src/freenet/n
ode/Location.java

It sort-of-works; the exclusion list is matched exactly as opposed to
modulo Double.MIN_VALUE... is that good enough?

Florent
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Routing question

2015-10-18 Thread Matthew Toseland
On 18/10/15 14:33, Florent Daigniere wrote:
> On Tue, 2015-10-13 at 22:26 +0200, Bert Massop wrote:
>> On 10-10-15 21:01, nextg...@freenetproject.org wrote:
>>> Oi!
>>>
>>> Given https://github.com/freenet/fred/blob/next/src/freenet/node/Pe
>>> erManager.java#L1049
>>>
>>> We seem to get our peers location and their peer's location...
>>> Iterate over the whole lot (150^2 entries max)
>>> ...
>>> And then for each of them, ensure that they haven't been routed to
>>> before by iterating over a list of locations we've already routed
>>> to (18 is the max HTL)
>>> ...
>>> That's 18*150^2 => 405k iterations in the worst case ... on the
>>> hot-path (each request we route will go through it)!
>> Nice find!
>> One more thing: you forgot the recentlyFailed case, where it recurses
>> twice (line 1173 and 1183), leading to a total of 1.2M iterations
>> worst
>> case (however this case is extremely unlikely to occur).
>>
>>> It seems very naive.
>>>
>>> How can we do better? Locations are double, what's the best data-
>>> structure here?
>>> I'm rusty: is a TreeMap with a custom comparator (NavigableMap)
>>> what we want?
>> At least keeping FOAF peers locations sorted would allow for binary
>> search, i.e. O(log n) instead of O(n) search, at the cost of an
>> initial
>> O(n log n) sort. That would however require reworking
>> PeerNode.getPeersLocation() to return (and maintain) a sorted array.
>>
>> If one would then also keep the routedTo (of size m) sorted by
>> location,
>> we can find the FOAF (assuming peer count n for ourself and all of
>> our
>> peers) closest to the target location not in routedTo in O(log n + m)
>> time for each of our peers, and when done intelligently, find our
>> peer
>> with the FOAF closest to the target O(n log n + nm + m log m) for
>> *all*
>> of our peers. That would bring the iteration count down around 100-
>> fold
>> from the current situation.
>>
>> For any further optimization, the hardest part would be to figure out
>> all the corner cases present in the current spaghetti. ☺
>>
> That'd work nicely if we weren't doing the operations modulo
> Double.MIN_VALUE. Is there any reason why our locations doubles and not
> BigDecimals?
>
> http://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html
>
> Toad?
Originally speed. Of course we can look up nodes by a Double - but we
need to look for >= and <=, so it's more complicated, and I don't recall
whether the API allows it cheaply (you used to have to create a bunch of
objects e.g. iterators etc to get next-key-after-target). I agree that
being able to do an exact match might be useful for some optimisations.
> Any objection to that being changed?
How about using a long? More bits than a double, at least for the vast
majority of the space. And we can map 0.5 to 0, so sign isn't a big deal.

Note that we use locations in both routing and swapping, so it might
affect more code.
> Florent



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Routing question

2015-10-18 Thread Florent Daigniere
On Tue, 2015-10-13 at 22:26 +0200, Bert Massop wrote:
> On 10-10-15 21:01, nextg...@freenetproject.org wrote:
> > Oi!
> > 
> > Given https://github.com/freenet/fred/blob/next/src/freenet/node/Pe
> > erManager.java#L1049
> > 
> > We seem to get our peers location and their peer's location...
> > Iterate over the whole lot (150^2 entries max)
> > ...
> > And then for each of them, ensure that they haven't been routed to
> > before by iterating over a list of locations we've already routed
> > to (18 is the max HTL)
> > ...
> > That's 18*150^2 => 405k iterations in the worst case ... on the
> > hot-path (each request we route will go through it)!
> 
> Nice find!
> One more thing: you forgot the recentlyFailed case, where it recurses
> twice (line 1173 and 1183), leading to a total of 1.2M iterations
> worst
> case (however this case is extremely unlikely to occur).
> 
> > It seems very naive.
> > 
> > How can we do better? Locations are double, what's the best data-
> > structure here?
> > I'm rusty: is a TreeMap with a custom comparator (NavigableMap)
> > what we want?
> 
> At least keeping FOAF peers locations sorted would allow for binary
> search, i.e. O(log n) instead of O(n) search, at the cost of an
> initial
> O(n log n) sort. That would however require reworking
> PeerNode.getPeersLocation() to return (and maintain) a sorted array.
> 
> If one would then also keep the routedTo (of size m) sorted by
> location,
> we can find the FOAF (assuming peer count n for ourself and all of
> our
> peers) closest to the target location not in routedTo in O(log n + m)
> time for each of our peers, and when done intelligently, find our
> peer
> with the FOAF closest to the target O(n log n + nm + m log m) for
> *all*
> of our peers. That would bring the iteration count down around 100-
> fold
> from the current situation.
> 
> For any further optimization, the hardest part would be to figure out
> all the corner cases present in the current spaghetti. ☺
> 

That'd work nicely if we weren't doing the operations modulo
Double.MIN_VALUE. Is there any reason why our locations doubles and not
BigDecimals?

http://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html

Toad?

Any objection to that being changed?

Florent

signature.asc
Description: This is a digitally signed message part
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Routing question

2015-10-13 Thread Bert Massop
On 10-10-15 21:01, nextg...@freenetproject.org wrote:
> Oi!
> 
> Given 
> https://github.com/freenet/fred/blob/next/src/freenet/node/PeerManager.java#L1049
> 
> We seem to get our peers location and their peer's location...
> Iterate over the whole lot (150^2 entries max)
> ...
> And then for each of them, ensure that they haven't been routed to before by 
> iterating over a list of locations we've already routed to (18 is the max HTL)
> ...
> That's 18*150^2 => 405k iterations in the worst case ... on the hot-path 
> (each request we route will go through it)!

Nice find!
One more thing: you forgot the recentlyFailed case, where it recurses
twice (line 1173 and 1183), leading to a total of 1.2M iterations worst
case (however this case is extremely unlikely to occur).

> It seems very naive.
> 
> How can we do better? Locations are double, what's the best data-structure 
> here?
> I'm rusty: is a TreeMap with a custom comparator (NavigableMap) what we want?

At least keeping FOAF peers locations sorted would allow for binary
search, i.e. O(log n) instead of O(n) search, at the cost of an initial
O(n log n) sort. That would however require reworking
PeerNode.getPeersLocation() to return (and maintain) a sorted array.

If one would then also keep the routedTo (of size m) sorted by location,
we can find the FOAF (assuming peer count n for ourself and all of our
peers) closest to the target location not in routedTo in O(log n + m)
time for each of our peers, and when done intelligently, find our peer
with the FOAF closest to the target O(n log n + nm + m log m) for *all*
of our peers. That would bring the iteration count down around 100-fold
from the current situation.

For any further optimization, the hardest part would be to figure out
all the corner cases present in the current spaghetti. ☺

— Bert



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl