On Sat, 2002-11-30 at 02:41, Erendil at aol.com wrote:
>
> Ok, now, I'm going to go through a possible sample of TUK insertation and
> request, minus the date part of the timestamp as there is no need for that
> in a simple explanation.
>
> a. Alabaster has index.html
> b. Alabaster uploads index.html as a TUK with time 1:00am to NodeA
> c. We will refer to this key as TUK at blah/index.html
> d. NodeA does not have any TUK at blah/index.html
> e. NodeA stores it.
> f. NodeA passes it to NodeB who does not have TUK at blah/index.html
> g. NodeB stores it.
> h. NodeC wants this content he heard about, TUK at blah/index.html
> i. NodeC asks NodeA for TUK at blah/index.html
> j. NodeC receives TUK at blah/index.html and stores TUK at blah/index.html
> h. Alabaster wants to add his new blog entry so he changes index.html.
> i. Now, Alabaster uploads index.html as TUK at blah/index.html with
>    timestamp 2:00am to NodeA
> j. NodeA has an older version of TUK at blah/index.html, so it replaces
>    the old TUK at blah/index.html with the new TUK at blah/index.html
> k. NodeA cannot reach NodeB, so the new TUK at blah/index.html is not sent
> l. NodeD wants TUK at blah/index.html now.
> m. NodeD asks NodeB for TUK at blah/index.html. It receives the old one
>    (1:00am) and stores it.
> n. NodeB decides it wants to read this phenomenon also. So it remembers
>    that NodeA has it. It gets the new TUK at blah/index.html and replaces
>    the old one.

Step n is really iffy. "Remembering" the sources of insert requests is not
very helpful, because connections are not guaranteed to be persistent.

What if, instead of connecting directly from A->B, the routing was
A->E->F->G->H->B? A request for the content winds up at B and finds that B
has a version of TUK at blah/index.html. A is the only node on the insert
path that has the most recent version of the file. If the request fails at
any of the nodes between A and B (e.g. if H is down), B won't be able to
find the most recent version.

When A inserts the new file, it is not guaranteed to take the same path, so
it might go A->F->J->K->L. B could keep the request going until it reaches
one of those nodes, but then the request would have to continue past those
nodes too, because you can't be sure that the data on them is the most
recent either. Each request would have to go the full HTL to get the most
recent document.

Here's an example of my idea for TRKs:

a. Alabaster has index.html
b. Alabaster uploads TRK at blah/index.html at 1:00 AM with an estimated
   update time of 2:00 AM
c. Alabaster uses an HTL of 10, so it goes through nodes
   A->B->C->D->E->F->G->H->I->J and they all store it (let's say Alabaster
   is node A).
d. At 1:30, Alice (Node A2) wants Alabaster's site. Let's say this request
   goes A2->B2->C2->D2->G. When it gets to G, G checks the current time
   against the estimated update time on the key. Since the key isn't
   supposed to be updated until 2:00 AM, the request doesn't need to go
   further. G returns its version of the data.
e. At 2:05, Alabaster hasn't updated his site, and Bob (Node A3) wants it.
   He requests it with an HTL of 10. The request might go
   A3->B3->H->I->J->K->L->M->N->O. It goes the full HTL, but does not find
   a published version more recent than the 1:00 AM version. It does,
   though, return the 1:00 AM version, which is currently the most recent
   version.
f. Alabaster has his new TRK at blah/index.html, and finally publishes it
   at 2:30 AM. He is tired, and doesn't think he's going to update it
   anytime soon, so he sets the estimated next update time to 6:00 PM that
   same day. Let's say his insert goes through
   A->B->B2->E->G->I->J->K->L->M
g. At 3:00 AM, Carol (Node A4) requests TRK at blah/index.html. Let's say
   her request goes A4->B4->D2->F->I. D2 and F only have the 1:00 AM
   version, so they continue looking because it's after 2:00 AM (the
   estimated update time for the 1:00 AM insert). Node I has the most
   recent version, and since it's before 6:00 PM, it doesn't look any
   further. D2 and F are updated with the most recent version.
h. Now here's a more complex (and hopefully realistic) part. Alabaster
   wakes up early and gets all his work done early, so he's ready to update
   his site at 3:00 PM. He still thinks he'll update it at 6:00 PM again,
   so he inserts his site at 3:00 PM with the estimated update time set to
   6:00 PM again. His insert goes A->Z->Y->E->F->I->O->M->L->N.
i. Now if some random node on the network requests the data (before
   6:00 PM), it may or may not get the most recent version. It depends on
   whether the request from that node first hits one of the nodes with the
   3:00 PM version or one with the 2:30 AM version.
j. At 6:00 PM, Alabaster is late again and doesn't update his page until
   7:00 PM. Between 6:00 PM and 7:00 PM, all requests for his content will
   go the full HTL. Those requests will get the most recent version because
   3:00 PM is more recent than 2:30 AM, and only one node on the request
   chain has to have the 3:00 PM version for the requester to get it.

The difference between TRKs and TUKs is that TRKs don't always need to go
the full HTL. They would basically do the same thing if an author always
set his/her estimated update time to a date already in the past, say,
Midnight January 1, 2000.

With both ideas, the data for new content needs to be routed to the same
place as the old content, so that a single routing key can find the data.
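
The lookup rule in the TRK example can be sketched in Python. This is only a rough sketch under my own assumptions, not Freenet code: `Version`, `Node`, `insert`, and `request` are hypothetical names, the length of the path list stands in for the HTL, and caching the result on every node the request visited is an assumption (the example only says that the stale nodes on the path get updated).

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Version:
    published: datetime    # when the author inserted this version
    next_update: datetime  # the author's estimated time of the next update
    data: str


@dataclass
class Node:
    name: str
    store: Dict[str, Version] = field(default_factory=dict)


def insert(path: List[Node], key: str, version: Version) -> None:
    """Store the version on every node along the insert path."""
    for node in path:
        held = node.store.get(key)
        if held is None or version.published > held.published:
            node.store[key] = version


def request(path: List[Node], key: str, now: datetime) -> Optional[Version]:
    """Walk the request path; the path length stands in for the HTL."""
    best: Optional[Version] = None
    visited: List[Node] = []
    for node in path:
        visited.append(node)
        held = node.store.get(key)
        if held is not None and (best is None or held.published > best.published):
            best = held
        # Stop early only if the newest copy seen is not yet due for an update.
        if best is not None and now < best.next_update:
            break
    # Assumption: every visited node caches the newest version that was found.
    if best is not None:
        for node in visited:
            node.store[key] = best
    return best
```

With an estimated update time that is always in the past, `request` never stops early and walks the whole path, which matches the TUK behaviour described above.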

Both TUKs and TRKs would require changes to FNP, so I don't think either
will happen anytime soon.

-Scott Young

_______________________________________________
devl mailing list
devl at freenetproject.org
http://hawk.freenetproject.org/cgi-bin/mailman/listinfo/devl
