Took the dog for a walk, which helps with thinking. So: a very short reply (as short as I can make it); I will ponder the entire email chain tomorrow morning.

The meta-question I pondered during the walk was "what can be prototyped in a few days/weeks?" and the answer becomes simpler (because the choices are fewer). Two parts.

Part one: create a custom atom (UpLink (Atom X) (Number N)) and it will return the incoming set up to N steps upward from X. This atom has a unique hash, and so can always be found. What should a remote server do, when asked for this? Well, it could just do the look-up, then and there, and return the results. Alternately, the remote node can do the lookup, attach the results, together with a timestamp, as a Value on that UpLink, and return that. That way, you know how old/stale your results are.

Part two: Oh wait, we already have UpLink. It's called JoinLink.

Part three: Oh wait, we could do this with an arbitrary BindLink/GetLink. We could take the most recent search results, attach a timestamp to the results and attach that as a Value on some key. That way, you can ask the network for that Bind/Get, and if it comes back with a new-enough timestamp, you can be happy and just use the results, and not re-perform the search. If you're not happy with the results, you can re-run the pattern match, and publish your latest/greatest results to the world. (Or rather, you attach the results to the Bind/Get, and announce "I too have a copy of this atom")

There are still a few holes in what I describe above, but maybe they're not serious. Not sure. My sense is that some variant of this can be prototyped in not much time at all. It might even be usable for the genomics work, where the data is almost totally static, where many searches tend to be built on sub-searches which can be cached, and do not have to be recomputed each time. Currently, we cache in a scheme wrapper, but caching search results as a Value on the Get/BindLink itself makes more sense.

(FWIW this kind of caching was briefly done for openpsi, many years ago, but fell into disuse. Amen might remember details, I don't.
I just remember that caching made sense, at the time.)

Habash was thinking about a server for the genomics data, but I think he was going in a different direction. But maybe this works for him?

--linas

On Thu, Jul 23, 2020 at 10:20 PM Ben Goertzel <b...@goertzel.org> wrote:

> Differently but indirectly relatedly, this caching system for graph
> queries looks interesting,
>
> https://openproceedings.org/2017/conf/edbt/paper-119.pdf
>
> On Thu, Jul 23, 2020 at 8:15 PM Ben Goertzel <b...@goertzel.org> wrote:
> >
> > Matt,
> >
> > So regarding these requirements,
> >
> > > 1. Some cluster node will "own" each atom by assignment via some simple division of the hash address space.
> > > 2. Each cluster node will also contain replicas of many other atoms, not only for disaster recovery purposes, but also because mind agents on that node will need in local memory many atoms "owned" by other nodes. Once we've obtained them from their owners, we might as well keep them around until we need to recover memory space for other "borrowed" atoms more urgently needed.
> > > 3. A mind agent on a given node wants to be able to update atom properties (truth value, etc) locally, without having to talk to the "owner" node directly.
> > > 4. Perfect consistency of atom state between different nodes is not a strict requirement, but it is desirable for a node to be able to identify the 'authoritative' source for a given atom, and that source should reflect a reasonably recent state of the atom as updated by any replica node.
> > > 5. Relatively poor storage efficiency is acceptable. I.e., a single node may only be able to dedicate a relatively small portion of its memory to storing the atoms it owns; a majority of its space may go to replicated atoms. Nodes are cheap; we'll just buy more. :-)
> > >
> > > Given those design goals, I think we're looking at a publish-subscribe model for replicating updates to atoms.
> >
> > -- what Linas and Cassio and Senna have all posited, is that it may be
> > more sensible to replace "Atom" with "Chunk" (i.e. sub-metagraph) in
> > the above requirements..
> >
> > What the references I sent in my just-prior email suggest is that, for
> > the sorts of graphs that tend to be created in real life, defining
> > Chunks in a fairly simple heuristic way (i.e. each chunk is just a
> > bunch of tightly-ish connected nodes and links) rather than via
> > running an expensive partitioning algorithm will generally be
> > adequate.
> >
> > The requirements you state are in my view correct as regards Atoms.
> > However, the perspective being put forth is that handling these
> > requirements explicitly on the level of Atoms rather than Chunks will
> > become computationally intractable given the number of Atoms involved
> > and the dynamic nature of the Atomspace.
> >
> > -- Ben
>
> --
> Ben Goertzel, PhD
> http://goertzel.org
>
> “The only people for me are the mad ones, the ones who are mad to
> live, mad to talk, mad to be saved, desirous of everything at the same
> time, the ones who never yawn or say a commonplace thing, but burn,
> burn, burn like fabulous yellow roman candles exploding like spiders
> across the stars.” -- Jack Kerouac
>
> --
> You received this message because you are subscribed to the Google Groups
> "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to opencog+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/opencog/CACYTDBcKdcwq%3DBpZ9dS3p4B9-stHF3BOAj5LqngcsLL1%3DQVmMg%40mail.gmail.com
> .

--
Verbogeny is one of the pleasurettes of a creatific thinkerizer.
--Peter da Silva

--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opencog+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAHrUA34AfV5UbAJA2M0VzRe-iJtkg-Ska0xq3zWY-iEYW6babA%40mail.gmail.com.
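
A minimal sketch of the timestamped-cache idea from "Part three" of this message: keep the most recent search results together with a timestamp, reuse them when they are new enough, and otherwise re-run the search and publish the fresh results. This is plain Python, not AtomSpace code. The names QueryCache, CachedResult, fetch and max_age are all hypothetical, and a dict keyed by the query stands in for "attach the results as a Value on the Bind/Get".

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List, Optional


@dataclass
class CachedResult:
    """Search results plus the time they were computed (hypothetical type)."""
    results: List[str]
    timestamp: float


class QueryCache:
    """Toy stand-in for a Value attached to a Bind/Get atom (assumption)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._store: Dict[Hashable, CachedResult] = {}
        self._clock = clock

    def publish(self, query: Hashable, results: List[str]) -> CachedResult:
        # Record the results together with the current time.
        entry = CachedResult(list(results), self._clock())
        self._store[query] = entry
        return entry

    def lookup(self, query: Hashable, max_age: float) -> Optional[CachedResult]:
        # Return the cached entry only if it exists and is new enough.
        entry = self._store.get(query)
        if entry is None:
            return None
        if self._clock() - entry.timestamp > max_age:
            return None
        return entry


def fetch(cache: QueryCache, query: Hashable,
          run_search: Callable[[Hashable], List[str]],
          max_age: float) -> List[str]:
    """Use fresh cached results if available, else re-run and republish."""
    entry = cache.lookup(query, max_age)
    if entry is not None:
        return list(entry.results)
    results = run_search(query)
    cache.publish(query, results)
    return list(results)
```

The caller picks max_age, so each requester decides for itself how stale is too stale; for nearly static data such as the genomics case mentioned above, a very large max_age would make almost every request a cache hit.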