On Mon, May 9, 2022, 8:12 AM Undiscussed Horrific Abuse, One Victim of Many <gmk...@gmail.com> wrote:
> On Mon, May 9, 2022, 8:05 AM Undiscussed Horrific Abuse, One Victim of
> Many <gmk...@gmail.com> wrote:
>
>> To represent normal goal behavior with maximization, the
>
> This is all confused to me, but normally, when we meet goals, we don't
> influence things not related to the goal. This is not usually included in
> maximization, unless
>
>> return function needs to not only be incredibly complex, but
>
> the return to be maximized were to include them, maybe by always being
> 1.0; I don't really know.
>
>> also feed back to its own evaluation, in a way not
>
> Maybe this relates to not learning habits unrelated to the goal, which
> would influence other goals badly.
>
>> provided for in these libraries.
>
> But something different is thinking at this time. It is the role of a part
> of a mind to try to relate with the other parts. Improving this in a
> general way is likely well known to be important.
>
>> Daydreaming: I'm thinking of how, in reality and normality, we have many,
>> many goals going at once (most of them "common sense" and/or "staying a
>> living human"). Similarly, I'm thinking of how normal transformer models
>> are trained according to a loss rather than a reward.
>>
>> I'm considering: what if it were more interesting when an agent _fails_ to
>> meet a goal? Its reward would usually be full, 1.0, but would be multiplied
>> by losses when goals are not met.
>>
>> This seems much nicer to me.
>
> I don't know how RL works, since I haven't taken the course, but from a
> distance it looks like this would just learn at a different (slower) rate
> [with other differences].
>
> yes
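
To make the multiplicative-reward daydream above concrete, here is a minimal
sketch, not from the thread itself: it assumes each goal reports a loss in
[0, 1] (0 meaning the goal is met), and the names (multiplicative_reward,
example_goals, the state dict) are made up for illustration. The reward starts
at the full value 1.0 and is only multiplied down by goals that go unmet, so
many always-on "background" goals leave it untouched while they stay satisfied.

# Minimal sketch (assumption, not from the thread): each goal maps the current
# state to a loss in [0.0, 1.0], where 0.0 means the goal is fully met.
from typing import Callable, Dict

Loss = float  # 0.0 = goal met, 1.0 = goal completely unmet


def multiplicative_reward(state: dict, goals: Dict[str, Callable[[dict], Loss]]) -> float:
    """Reward starts full at 1.0 and is multiplied by (1 - loss) for each goal,
    so only unmet goals pull it down."""
    reward = 1.0
    for loss_fn in goals.values():
        reward *= 1.0 - loss_fn(state)  # a met goal (loss 0.0) leaves the reward unchanged
    return reward


# Toy usage: two always-on goals plus one task goal.
example_goals = {
    "common_sense": lambda s: 0.0,               # met: no effect on the reward
    "stay_alive": lambda s: 0.1,                 # slightly unmet: reward shrinks a little
    "task": lambda s: s.get("task_error", 0.0),  # task-specific loss read from the state
}

print(multiplicative_reward({"task_error": 0.5}, example_goals))  # 1.0 * 0.9 * 0.5 = 0.45

Whether an RL algorithm trained on this return would just learn slower, as
guessed above, or behave differently, is left open here; the sketch only shows
the shape of the return, not a claim about any particular library.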