Ongoing breakup of the TL;DR into small pieces.

On Sun, May 20, 2018 at 10:53 AM, Alexey Potapov <pota...@aideus.com> wrote:

>
> Both tasks can be considered as a part of the Semantic Vision problem, but
> their solution can be useful in a more general context.
>
> *OpenCog + Tensorflow*
> Depth of OpenCog+Tensorflow integration can be quite different. Shallow
> integration implies that Tensorflow is used as an external module, and
> communication between Tensorflow and OpenCog is limited to passing
> activities of neurons, which are represented both by Tensorflow and
> Atomspace nodes.
> The most restricted way is just to run (pre-trained) TF models on input
> data and to set values of Atomspace nodes in correspondence with the
> activities of output neurons. What will be missing in this case: feedback
> connections from the cognitive level to the perception system; online (and
> joint) training of neural networks and OpenCog.
> Let us consider the Visual Question Answering (VQA) task as a motivating
> example. How will OpenCog be able to answer such questions as “What is the
> color of the dress of the girl standing to the left of the man in a blue
> coat?” If our network is pre-trained to detect and recognize all objects in
> the image and supplement them with detailed descriptions of colors, shapes,
> poses, textures, etc., then Pattern Matcher will be able to answer such
> questions (converted to corresponding queries). However, this approach is
> not computationally feasible: there are too many objects in images, and too
> many grounded predicates which can be applied to them.
>

Is that true? Maybe. Certainly, "color of a dress" is a long-term durable
property of a dress: it will not change for hours. In that sense, it is
appropriate to record it, statically, in the AtomSpace.

One form of autism, I am told, is that the brain is overwhelmed with
sensory data: one is seeing and hearing everything, and cannot focus on any
one thing.  Perhaps this could become a risk for the atomspace. But -- "too
many objects in images, and too many grounded predicates" -- how many are
we talking about, here? Dozens, hundreds of objects? Hundreds of predicates
per object? That is 100x100 = 10K atoms and, currently, you can create and
add maybe 100K atoms/sec to the atomspace (via C++; less via scheme or
python, due to wrapper overhead). So this seems manageable.

Of course, it can be much more efficient to "not notice something until
someone asks you about it". And then you can respond, and say "Hey, I never
noticed that before, but yes, now that you asked, I can now clearly see
that her dress is blue".   My son was trampling over flowers, the other
day, which, for some reason, he had not noticed until I pointed at them.
Odd, since they were bright blue, albeit quite small.


> Thus, the question should influence the process of how the image is
> interpreted.
> For example, even if we detected bounding boxes (BBs) for all objects and
> inserted them into AtomSpace, the predicate “left of” is not immediately
> evaluated for all pairs of BBs. Instead, it will be evaluated during query
> execution by Pattern Matcher (hopefully) only for relevant BBs labeled as
> “girl” and “man”.
>
Yes.


> Similarly, grounded predicate “is blue” implemented by a neural subnetwork
> can be computed only in the course of query execution, meaning that the work
> of the Pattern Matcher should be extended to the neural network level.
>

There is a generic mechanism called "GroundedPredicateNode", and it can
call arbitrary C++/scheme/python/haskell code, which must return a
true/false value.  True means "yes, match and continue with the rest of the
query".

Unfortunately, GroundedPredicateNodes are "black boxes"; we do not know
what is inside. Thus, it is sometimes useful to define "clear boxes": for
example, GreaterThanLink.  The GreaterThanLink can handle an infinite
number of inputs, but it is not a black box: we know exactly what kind of
inputs it expects, what it produces, what it does.  Thus, it is possible to
perform logical reasoning on GreaterThanLinks, and/or perform algebraic
simplification (a<b and b<c implies a<c, etc.)



> Indeed, purely DNN solutions for VQA usually implement some top-down
> processes at least in the form of attention mechanisms.
> Apparently, a cognitive feedback to perception is necessary for AGI in
> general.
> It is not a problem to feed Tensorflow models with data generated by
> OpenCog via placeholders, but OpenCog will also need some interface for
> executing computational graphs in Tensorflow. This can be done by binding
> corresponding Session.run calls to Grounded Predicate/Schema nodes.
>

... or to Values.
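
For the GroundedSchema route, a minimal sketch, assuming some hypothetical
python helper run_tf_model that wraps the Session.run call:

(cog-execute!
  (ExecutionOutput
    (GroundedSchema "py: run_tf_model")  ; hypothetical python wrapper
    (List (Concept "input-image"))))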

>
> The question is how to combine OpenCog and neural networks on the
> algorithmic level. Let us return to the VQA query considered above. We can
> imagine a grounded schema node, which detects all bounding boxes with a
> given class label, and inserts them into Atomspace,
>

For example, one creates a ConceptNode "dress".  One also creates a
PredicateNode "*-bounding-box-*".  Then one writes C++ code to implement
the TensorFlowBBValue object.  One then associates all three:

(cog-set-value! (Concept "dress") (Predicate "*-bounding-box-*")
  (TensorFlowBBValue "obj-id-42"))

What is the current bounding box for that dress?  I don't know, but I can
find out:

(cog-value->list
  (cog-value (Concept "dress") (Predicate "*-bounding-box-*")))

returns 2 or 4 floating point numbers, as a list.    Is Susan wearing that
dress?

(cog-set-value! (Concept "Face-of-Susan") (Predicate "*-bounding-box-*")
  (TensorFlowBBValue "obj-id-66"))

(define (is-near? A B)
  (> 0.1 (distance
    (cog-value A (Predicate "*-bounding-box-*"))
    (cog-value B (Predicate "*-bounding-box-*")))))

returns true if there is less than 0.1 meters distance between the bounding
boxes on A and B.  (Here, distance is whatever helper compares the two box
values.)

The actual location of the bounding boxes is never stored, and never
accessed, unless the is-near? predicate runs.



> so the Pattern Matcher or Backward Chainer can further evaluate some grounded
> predicates over them, finally finding an answer to the question. However,
> the question can be “What is the rightmost object in the scene?” In this
> case, we don’t expect our system to find all objects, but rather to examine
> the image starting from its right border.
>

This is uglier, and there are several reasonable solutions.  It depends on
whether or not you want to waste CPU cycles maintaining a left-right sorted
list of objects.  Performing ten sorts per second is expensive, if you are
almost never interested in the right-most object.
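
The lazy version might look like this, in scheme; nothing is maintained
between queries, and bb-right-edge is a hypothetical helper that pulls the
right edge out of the bounding-box value:

(use-modules (srfi srfi-1))  ; for fold and last

(define (bb-right-edge obj)
  ; Hypothetical: take the last float of the box to be its right edge.
  (last (cog-value->list
    (cog-value obj (Predicate "*-bounding-box-*")))))

(define (rightmost objects)
  ; Computed only on demand; no sorted list is kept.
  (fold (lambda (obj best)
          (if (> (bb-right-edge obj) (bb-right-edge best)) obj best))
        (car objects) (cdr objects)))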


> We can imagine queries calling for other strategies of image
> processing/examination. In general, we would like not to hardcode all
> possible cases, but to have a general mechanism that can be trained to
> execute different queries.
>
Yes.


>
> To make neural networks transparent for the Pattern Matcher, we need to make
> the nodes of Tensorflow also inhabitants of the Atomspace.
>
Yes.


> The same is needed for a general case of unsupervised learning. In
> particular, architecture search is needed in order to achieve better
> generalization with neural networks or simply to choose an appropriate
> structure of the latent code. Thus, OpenCog should be able to add or
> delete nodes in Tensorflow graphs.
>

Yes. Atoms are best used for representing graphs and relationships that are
stable over long periods of time (more than a few seconds, on
current-generation CPUs).

>
> These nodes correspond not just to neural layers, but also to operations
> over them. One can imagine TensorNode nodes connected by PlusLink,
> TimesLink, etc.
>

Yes.  However, we might also need PlusValue or TimesValue.  I do not know
why, yet, but these are potentially useful, as well.
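
For example, a fragment of a computational graph, written as Atoms.
TensorNode is hypothetical; PlusLink and TimesLink exist today, but
currently expect NumberNodes:

(PlusLink
  (TimesLink (TensorNode "weights-1") (TensorNode "input"))
  (TensorNode "bias-1"))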


> There can be tricky technical issues with Tensorflow (e.g. the compilation
> of dynamic graphs), but they should be solvable.
> A conceptual problem consists in the fact that the Pattern Matcher works
> with Atoms, but not with Values. Apparently, the activities of neurons
> should be Values. However, the evaluation of, e.g., GreaterThanLink requires
> NumberNodes.
>

This is a historical accident. GreaterThanLink and NumberNode were
invented long before the idea of Values became clear.  Now that the
usefulness of Values is becoming clear, it's time to redesign
GreaterThanLink.

Perhaps we need an IsLeftOfLink that automatically knows to obtain the
"*-centroid-*" value on two atoms, and then return true/false depending on
the result (or throw an exception if there is no *-centroid-* value).
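
A sketch of how the earlier girl/man query might then look -- IsLeftOf is
the hypothetical link, the rest is today's pattern-matcher syntax:

(Get
  (VariableList (Variable "$girl") (Variable "$man"))
  (And
    (Inheritance (Variable "$girl") (Concept "girl"))
    (Inheritance (Variable "$man") (Concept "man"))
    (IsLeftOf (Variable "$girl") (Variable "$man"))))

Running this with cog-execute! would fetch the *-centroid-* values only for
the candidate girl/man pairs, never for anything else.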

The pattern matcher can then work, without any modification at all, with
IsLeftOfLink.  I assume the same would be true for URE/PLN.

Bonus: because IsLeftOfLink is a "clear-box" link, we can reason about it,
without actually having to access any values. We know that "A left-of B,
B left-of C" implies "A left-of C".

--linas


> Operations over (truth) values are usually implemented in Scheme within
> rules fed to URE. This might be enough for dealing with individual neuron
> activities as truth values and with neural networks as grounded predicates,
> but patterns in values cannot be matched or mined directly (while the idea
> of SynerGANs implied the necessity of mining patterns in the activities of
> neurons of the latent code).
>
> I was going to illustrate with concrete examples the same kind of problems
> with implementing probabilistic programming in OpenCog, but I guess it's
> already TL;DR.
>
> So, briefly speaking, we need the Pattern Matcher and Pattern Miner to work
> over Values/Valuations, which is not the case now (OpenCog uses only truth
> and attention values, and Atomese/Pattern Matcher doesn't have built-in
> semantics even for them). I quote Linas here:
> "Atoms are:
>
> * slow to create, hard to destroy
>
> * are indexed and globally unique
>
> * are searchable
>
> * are immutable
>
>
> Values are:
>
> * fast and easy to create, destroy, change
>
> * values are highly mutable.
>
> * values are not indexed, are not searchable, are not globally unique."
>
> But we need "fast and easy to create, destroy, change, highly mutable, but
> searchable" entities. So, this is not only a technical, but also a
> conceptual problem...
>
> I would really like to hear your opinion on this. What should we do?
> Resort to the most shallow integration between OpenCog and DNNs? In this
> case, SynerGANs will not work since we will not be able to mine patterns in
> values, and we will not be able to use Pattern Matcher to solve VQA.
> Express the output of DNNs as Atoms? Linas objected even to the idea of
> expressing the coordinates and labels of bounding boxes as Atoms. To do this
> with activities of neurons would be even worse. Put everything into the
> Space-Time server? But then the idea of using the power of Pattern Matcher,
> URE, etc. will not be achievable. Extend Pattern Matcher to work with
> Values? Maybe... /*I like the idea of embedding the TF computational graph
> into the Atomspace, but tf.mul works over Values (tensors) - not
> NumberNodes. Thus, in this case, it will be required to make all links (like
> TimesLink) work not only with NumberNodes, but also with Values... but I
> foresee objections from Linas here... Also, I believe it would be useful in
> general, since Values are not first-class objects in Atomese - you must use
> Scheme/Python/C to describe how to recalculate truth values; you cannot
> reason about them directly...
>
> Or should we try to use a sort of PPL as a bridge between Values and
> Atoms? Maybe... Or maybe we should do something unifying all of these.*/
>
>
> The question is not just about binding vision and PLN. It is more general.
> Say, if you are driving a car, you estimate the distances and velocities of
> other cars and take actions on that basis. These are also Values, and you
> 'reason' over them using both 'number crunching' and 'logic' simultaneously
> (I don't mean procedural knowledge here, in the sense of GroundedSchemaNode).
> So, I don't think that we should limit ourselves to a shallow integration
> and use DNNs/PPLs/etc. only peripherally...
>
>
> Ben Goertzel <b...@goertzel.org>:
>
>> if one stays in the world of finite discrete
>> distributions, one can construct probabilistic logics with
>> sampling-based semantics... https://arxiv.org/pdf/1602.06420.pdf
>>
>
> Sounds quite interesting. I'll study it in detail...
>
>  -- Alexey
>
>
>


-- 
cassette tapes - analog TV - film cameras - you
