Regarding the genomics use-cases, there are many, but here are three that are clear and interesting:

1) Genome annotation

Here, we have a large bioAtomspace containing diverse information about e.g. human genes and their connections with proteins, diseases, etc. We then have a list of genes that results from some data analysis, and we want to find all the info in the (distributed) Atomspace related to these genes. This is very similar to what the Gearman distributed processing setup that Mandeep etc. made years back was supposed to do (and probably still does if it hasn't bitrotted too badly).

2) DeepWalk embedding vector creation

Here, we have the same bioAtomspace, and for each GeneNode (or e.g. each ConceptNode representing a GO category) we want to generate a large number of biased-random walks through that node. These walks are then processed by the DeepWalk algorithm and used to generate embedding vectors. In a distributed-Atomspace setting, the walks of length say K=10 through a node are very likely to wander across multiple machines.

3) PLN inference

We have an implication such as

"overexpression of gene G in person P"
ImplicationLink
"Person P will live past 90"

and we want to estimate its truth value using backward chaining inference. As the backward chainer (BC) does its thing, its premise-selection process will need to use distributed knowledge. For example, suppose the BC wants to evaluate the truth value of the premise

P5 = "Gene G IntensionalInheritance concept C"

where concept C is perhaps some GO category that G does not belong to extensionally. The BC may determine that the data about this premise P5 lies most centrally on lobe M12 in the distributed Atomspace, and then it may ask lobe M12 to evaluate the truth value of P5 using PLN. Then, in evaluating P5, M12 may wind up wanting truth value evaluations of other premises that rely on data centered on other lobes, etc.

These are all things we are doing right now on a localized Atomspace, but that would obviously benefit from a distributed-Atomspace implementation.
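
To make the point in use-case 2 a bit more concrete, here is a rough Python sketch (not OpenCog code) of how often a walk of K=10 steps from a gene node hops between machines. Everything in it is an assumption made for illustration: the toy graph, the node names, the round-robin placement of nodes onto three "lobes", and the use of plain uniform random walks in place of the biased walks mentioned above.

```python
import random
from collections import defaultdict

# Hypothetical toy knowledge graph; node names are illustrative only.
EDGES = [
    ("gene:G1", "protein:P1"),
    ("gene:G1", "go:C1"),
    ("gene:G2", "protein:P2"),
    ("gene:G2", "go:C1"),
    ("protein:P1", "disease:D1"),
    ("protein:P2", "disease:D1"),
    ("gene:G3", "go:C2"),
    ("gene:G3", "disease:D1"),
]


def build_adjacency(edges):
    """Build an undirected adjacency list from a list of edge pairs."""
    adjacency = defaultdict(list)
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    return dict(adjacency)


def assign_lobes(nodes, num_lobes):
    """Assumed placement policy: sorted round-robin over the lobes."""
    return {node: i % num_lobes for i, node in enumerate(sorted(nodes))}


def random_walk(adjacency, start, num_steps, rng):
    """Uniform random walk of num_steps hops starting at `start`.

    This is an unbiased simplification of the biased walks in the text.
    """
    walk = [start]
    for _ in range(num_steps):
        neighbors = adjacency[walk[-1]]
        if not neighbors:  # dead end; cannot happen in the toy graph
            break
        walk.append(rng.choice(neighbors))
    return walk


def count_lobe_crossings(walk, lobe_of):
    """Count hops whose two endpoints live on different lobes."""
    return sum(1 for a, b in zip(walk, walk[1:]) if lobe_of[a] != lobe_of[b])


K = 10  # walk length in hops, as in the text
adjacency = build_adjacency(EDGES)
lobe_of = assign_lobes(adjacency.keys(), num_lobes=3)

rng = random.Random(0)  # fixed seed so runs are repeatable
walks = [random_walk(adjacency, "gene:G1", K, rng) for _ in range(100)]
crossings = [count_lobe_crossings(w, lobe_of) for w in walks]
multi_lobe = sum(1 for c in crossings if c > 0)

print(f"{multi_lobe} of {len(walks)} walks touched more than one lobe")
print(f"mean cross-lobe hops per walk: {sum(crossings) / len(crossings):.2f}")
```

With a placement that ignores graph locality, most hops cross a lobe boundary, so each walk turns into a series of remote lookups. A placement that keeps neighboring Atoms on the same lobe would lower that count, which is one way to compare candidate partitioning schemes.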

-- Ben

On Wed, Jul 29, 2020 at 4:13 PM Ben Goertzel <b...@goertzel.org> wrote:
>
> >> Is there a public document somewhere describing actual, present use-cases
> >> for distributed atomspace? Ideally with some useful guesses at performance
> >> requirements, in terms of updates per second to be processed on a single
> >> node and across the cluster, and reasonably estimated hardware specs (num
> >> cores, ram, disk) per peer?
>
> A couple years ago we put together a document collecting together some
> use-cases for Distributed Atomspace (not ones we are currently using
> distributed Atomspace for, but stuff we're doing or have previously
> done that could use Distributed Atomspace). I will dig up that
> document and share it.

--
Ben Goertzel, PhD
http://goertzel.org