Regarding the genomics use-cases: there are many, but here are three
that are clear and interesting.

1) Genome annotation

Here, we have a large bioAtomspace containing diverse information
about e.g. human genes and their connections with proteins, diseases,
etc.

We then have a list of genes that results from some data-analysis, and
we want to find all the info in the (distributed) Atomspace related to
these genes.

This is very similar to what the Gearman distributed-processing setup
that Mandeep et al. made years back was supposed to do (and probably
still does, if it hasn't bitrotted too badly).
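
To make the annotation lookup concrete, here is a minimal sketch of the
fan-out: send the gene list to every lobe, collect the links that mention
those genes, and merge the results per gene. Everything in it is an
assumption for illustration only: the lobe names, the (type, source,
target) link tuples, and the functions query_lobe and annotate are
hypothetical, and none of this is the Atomspace API or the Gearman setup.

```python
from collections import defaultdict

# Hypothetical toy data: each lobe holds (link_type, source, target) tuples.
LOBES = {
    "M1": [("expresses", "GENE_A", "PROTEIN_X"),
           ("member_of", "GENE_B", "GO:0001")],
    "M2": [("associated_with", "GENE_A", "DISEASE_Q"),
           ("expresses", "GENE_C", "PROTEIN_Y")],
}


def query_lobe(links, genes):
    """Return every link on one lobe that mentions any of the genes."""
    return [link for link in links if link[1] in genes or link[2] in genes]


def annotate(genes, lobes=LOBES):
    """Fan the gene list out to every lobe and merge the hits per gene.

    In a real deployment each query_lobe call would be a remote request;
    here it is a local function call over in-memory lists.
    """
    genes = set(genes)
    result = defaultdict(list)
    for lobe_name, links in sorted(lobes.items()):
        for link_type, src, dst in query_lobe(links, genes):
            # Simplification: a link between two listed genes is filed
            # under its source gene only.
            gene = src if src in genes else dst
            other = dst if gene == src else src
            result[gene].append((link_type, other, lobe_name))
    return dict(result)
```

Calling annotate(["GENE_A"]) on the toy data gathers one link from M1 and
one from M2, which is the point: a single gene's annotations are spread
over several machines.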

2)  DeepWalk embedding vector creation

Here, we have the same bioAtomspace, and for each GeneNode (or e.g.
each ConceptNode representing a GO category) we want to generate a
large number of biased random walks through that node. These walks
are then processed by the DeepWalk algorithm and used to generate
embedding vectors.

Walks of length, say, K=10 through a node are very likely to wander
across multiple machines in a distributed-Atomspace setting.
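
A rough sketch of why the walks cross machines: generate a weighted random
walk over a small partitioned graph and count how many steps leave the
current machine. The graph, the edge weights, the node-to-machine placement
and the function names are all made up for illustration. For simplicity the
walk starts at the node rather than passing through it, and "length K" is
taken to mean K nodes. The DeepWalk training step itself is not shown.

```python
import random

# Hypothetical toy graph: node -> list of (neighbor, weight).
GRAPH = {
    "G1": [("GO:1", 2.0), ("P1", 1.0)],
    "GO:1": [("G1", 1.0), ("G2", 1.0)],
    "P1": [("G1", 1.0), ("G2", 3.0)],
    "G2": [("GO:1", 1.0), ("P1", 1.0)],
}

# Hypothetical placement of nodes on machines.
HOME = {"G1": "M1", "GO:1": "M1", "P1": "M2", "G2": "M2"}


def biased_walk(start, length, rng):
    """Return a walk of `length` nodes, choosing neighbors by edge weight."""
    walk = [start]
    while len(walk) < length:
        neighbors = GRAPH[walk[-1]]
        if not neighbors:
            break  # dead end: stop early
        nodes = [n for n, _ in neighbors]
        weights = [w for _, w in neighbors]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


def machine_crossings(walk):
    """Count steps whose two endpoints live on different machines."""
    return sum(1 for a, b in zip(walk, walk[1:]) if HOME[a] != HOME[b])


if __name__ == "__main__":
    rng = random.Random(0)
    walk = biased_walk("G1", 10, rng)
    print(walk, machine_crossings(walk))
```

Each crossing would be a network round-trip in a distributed setting, so
the number of crossings per walk, multiplied by the number of walks per
node, gives a first rough idea of the remote-fetch load this use-case
would generate.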

3) PLN inference

We have an implication such as

"overexpression of gene G in person P" ImplicationLink "Person P will
live past 90"

and we want to estimate its truth value using backward chaining
inference.  As the backward chainer does its thing, its
premise-selection process will need to use distributed knowledge.

For example, suppose the BC wants to evaluate the truth value of the
premise P5 = "Gene G IntensionalInheritance concept C" ... where
concept C is perhaps some GO category that G does not belong to
extensionally.

The BC may determine that the data about this premise P5 lies most
centrally on lobe M12 in the distributed Atomspace, and then it may
ask lobe M12 to evaluate the truth value of P5 using PLN.

Then, in evaluating P5, M12 may wind up wanting truth value
evaluations of other premises that rely on data centered on other
lobes... etc.
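
The delegation pattern just described can be sketched as a recursive call
that records every hop from one lobe to another. This is only a toy: the
premise names other than P5, the lobe M3, the dependency table, the stored
strengths and the evaluate function are all hypothetical, and the product
used to combine sub-results is a placeholder, not a PLN formula. It also
assumes the premise dependencies contain no cycles.

```python
# Hypothetical: the lobe holding the data most central to each premise.
HOME_LOBE = {"P5": "M12", "P7": "M3", "P9": "M12"}

# Hypothetical: sub-premises that each premise depends on.
DEPENDS_ON = {"P5": ["P7", "P9"], "P7": [], "P9": []}

# Hypothetical: strengths already stored for the leaf premises.
LOCAL_STRENGTH = {"P7": 0.8, "P9": 0.5}


def evaluate(premise, asking_lobe, trace):
    """Evaluate `premise` on its home lobe, logging cross-lobe requests.

    `trace` collects (from_lobe, to_lobe, premise) for every request
    that has to leave the asking lobe.
    """
    lobe = HOME_LOBE[premise]
    if lobe != asking_lobe:
        trace.append((asking_lobe, lobe, premise))
    subs = DEPENDS_ON[premise]
    if not subs:
        return LOCAL_STRENGTH[premise]
    # Placeholder combination rule (plain product), NOT a PLN formula.
    strength = 1.0
    for sub in subs:
        strength *= evaluate(sub, lobe, trace)
    return strength


if __name__ == "__main__":
    trace = []
    print(evaluate("P5", "M1", trace))
    print(trace)
```

Starting from a lobe M1, the trace shows M1 asking M12 for P5, and M12 in
turn asking M3 for P7, while P9 is resolved locally on M12. The length of
that trace is the number of cross-lobe requests one inference step costs.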

...

These are all things we are doing right now on a localized Atomspace,
but that would obviously benefit from a distributed-Atomspace
implementation.

-- Ben


On Wed, Jul 29, 2020 at 4:13 PM Ben Goertzel <b...@goertzel.org> wrote:
>
> >> Is there a public document somewhere describing actual, present use-cases 
> >> for distributed atomspace? Ideally with some useful guesses at performance 
> >> requirements, in terms of updates per second to be processed on a single 
> >> node and across the cluster, and reasonably estimated hardware specs (num 
> >> cores, ram, disk) per peer?
>
> A couple years ago we put together a document collecting together some
> use-cases for Distributed Atomspace (not ones we are currently using
> distributed Atomspace for, but stuff we're doing or have previously
> done that could use Distributed Atomspace).    I will dig up that
> document and share it.



-- 
Ben Goertzel, PhD
http://goertzel.org
