Not sure if I am following you...

In order to define the optimal control problem, you need the following
(a rough sketch in code follows the list):

   - State set: Set of all possible logic propositions. OK
   - Action set: Logic rules. It is not clear to me what this means. Can
   you choose which logic rule to apply to which proposition? I mean, the
   actions should be chosen by the agent (there may be constraints on which
   actions are available at each state, but there should be some freedom
   nonetheless; otherwise, there would be nothing to learn).
   - Expected reward function: some map from the state and action sets to
   the reals. You want it to be non-smooth. OK.
   - Transition kernel: represents the knowledge. Very interesting.
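
To make the structure concrete, here is a minimal Python sketch of those
four ingredients, assuming states are sets of propositions and actions are
inference rules (all the names below are mine, just for illustration):

    from typing import Callable, FrozenSet

    State = FrozenSet[str]             # a state = a set of logic propositions
    Action = Callable[[State], State]  # an action = a rule mapping a state
                                       # to the propositions it derives

    def transition(x: State, a: Action) -> State:
        # Deterministic transition kernel: the new knowledge state is the
        # old one plus whatever the rule derives from it.
        return x | a(x)

    def reward(x: State, a: Action) -> float:
        # Expected reward: a (possibly non-smooth) map from state-action
        # pairs to the reals, e.g. +1 once a desired proposition is derived.
        return 1.0 if "goal proposition" in a(x) else 0.0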

So let me try to understand with an example:

   - State at time t is a set of propositions, e.g. x_t = {"I am in my
   place", "my place is in Europe"}
   - Action at time t is a particular logic rule, e.g. a_t = {"if p --> q
   and q --> r, then p --> r"}
   - State transition: x_{t+1} = F(x_t, a_t) = x_t plus the derived
   proposition, i.e. from "I am in my place" and "my place is in Europe"
   the rule yields "I am in Europe"
   - Reward: something saying that this new state is desirable, makes
   sense, etc.
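
In code, this example transition might look something like the following
(a toy encoding, continuing the sketch above):

    x_t = frozenset({"I am in my place", "my place is in Europe"})

    def a_t(x):
        # The chosen rule: if both premises are in the state, derive the
        # conclusion; otherwise derive nothing.
        if {"I am in my place", "my place is in Europe"} <= x:
            return frozenset({"I am in Europe"})
        return frozenset()

    x_next = transition(x_t, a_t)   # x_{t+1} = F(x_t, a_t)
    # x_next now also contains "I am in Europe"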

Is this correct?

I am definitely lost with your comment about the Hamiltonian. I am familiar
with optimal control theory, but I don't see the story... In general you
don't need the velocity. What you need is an optimality condition, which
doesn't have to be related to any time derivative. Think, e.g., of the
Euler-Lagrange condition obtained by differentiating the reward function
with respect to the current and future states and with respect to the
action. It can be formulated even in discrete time.
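
For instance, with total reward sum_t r(x_t, x_{t+1}, a_t), the discrete
Euler-Lagrange conditions I have in mind are, in LaTeX notation (the exact
form of r here is just for illustration):

    \frac{\partial r(x_{t-1}, x_t, a_{t-1})}{\partial x_t}
      + \frac{\partial r(x_t, x_{t+1}, a_t)}{\partial x_t} = 0,
    \qquad
    \frac{\partial r(x_t, x_{t+1}, a_t)}{\partial a_t} = 0,

i.e. stationarity of the total reward with respect to each interior state
and each action, with no time derivative anywhere.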

Best


On Tue, May 14, 2019 at 1:52 PM YKY (Yan King Yin, 甄景贤) <
generic.intellige...@gmail.com> wrote:

> On Sun, May 12, 2019 at 10:22 PM Sergio VM <serteck...@gmail.com> wrote:
>
>> Hi King Yin,
>>
>> The architecture looks very interesting. I am just missing the definition
>> of the reward function (or kernel if you make it stochastic).
>>
>> On the other hand, I don't understand your previous comment on the
>> Lagrangian and Hamiltonian. I haven't seen the previous version of the
>> paper. But you can apply an optimal control approach without having to
>> consider the velocity at all.
>>
>
>
> The reward function is given externally by some "AI teachers".  For
> example, rewards given by an Atari game.
>
> The Lagrangian is the same as the instantaneous reward the system gets at
> time t.  In some cases, such as chess, the reward is just a delta function
> given at the terminal state (e.g. checkmate).  The Bellman equation (for
> dynamic programming) always works, no matter how the rewards are given.
> Hamiltonian/Lagrangian control theory may also work if the Lagrangian is
> given as delta functions, but in that case the solution of the
> differential equation would involve a discretization that simply reduces
> to the discrete dynamic-programming case.  In other words, the use of
> differential equations has no advantage over the discrete case!  Things
> would be different if the reward (Lagrangian) were differentiable with
> respect to the position x and velocity x-dot.  But such is not the case
> for some real-life problems, such as logic puzzles or chess games, in
> which the reward occurs only sparsely.  Hope that answers your question? ☺
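>
> To make this concrete, here is a toy value-iteration sketch in Python with
> a purely terminal ("delta") reward; all names are illustrative, not from
> the paper:
>
>     def value_iteration(states, actions, F, r, gamma=0.9, sweeps=100):
>         # Bellman backup: V(x) = max_a [ r(x, a) + gamma * V(F(x, a)) ].
>         # r may be zero everywhere except at terminal states (a "delta"
>         # reward, as in checkmate) and the backup still works unchanged.
>         V = {x: 0.0 for x in states}
>         for _ in range(sweeps):
>             for x in states:
>                 V[x] = max(r(x, a) + gamma * V[F(x, a)] for a in actions)
>         return V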
>
> I will re-organize the material in the older version and post it
> somewhere, just so the work is not wasted.  But I don't see any easy way to
> bridge that gap.  It doesn't seem to be a good idea to tamper with the
> reward function, other than the way it is given by the problem setup....
