If you can 'predict' something because you 'understand' what you need
to know about it, then you are saying that 'prediction' follows
'understanding'. I would use the term 'expectation' only because the
word 'prediction' sounds like an exaggeration. But it doesn't make sense
to say that someone can 'understand' what someone else is saying
because he can 'predict' a large number of the words the speaker is
going to use. That is nonsense. And, as I pointed out, it is just not true.
Jim Bromer


On Sat, Nov 29, 2014 at 5:29 PM, Matt Mahoney via AGI <[email protected]> wrote:
> On Fri, Nov 28, 2014 at 11:54 PM, Logan Streondj <[email protected]> wrote:
>> On Thu, Nov 27, 2014 at 10:35:03PM -0500, Matt Mahoney wrote:
>>> >> > You
>>> >> > are able to understand my words because you can predict a large
>>> >> > fraction of them and only need to remember the differences.
>>> >
>>> > that just sounds like a nonsensical statement.
>>>
>>> If you disagree, then try taking some of my words and scrambling them
>>> in random order and see which sentence is easier to remember. Then try
>>> scrambling the letters in random order. Do you see that the task
>>> becomes progressively harder because you are not able to predict the
>>> next word or next letter?
>>
>> er if words are in random order, they aren't grammatical,
>> if letters are in random order, they aren't vocabulary.
>> without grammar and vocabulary, there is no language.
>
> Again we are disagreeing on the semantics of the word "predict".
> Semantics, grammar, and vocabulary are sets of rules that we use to
> assign probabilities to strings of text such as sentences. If I
> scramble the words or letters, then they violate these rules and the
> probability is reduced. To understand how this makes prediction
> harder, you can use the chain rule, which states that for any string
> xy (x concatenated with y):
>
> P(xy) = P(x) P(y|x)
>
> and you can likewise split x and y into smaller strings and repeat,
> which gives you:
>
> P(x_1..n) = product over i in 1..n of P(x_i | x_1..i-1).
>
> Thus, the probability of any sentence can be expressed as the product
> of the conditional probabilities of each word, letter, or bit, given
> the parts of the sentence you have already seen. Since we usually
> input the words or letters sequentially over time, then recognizing a
> sentence as correct is the same as assigning it a high probability,
> which is the same as predicting it with a high rate of accuracy.
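
A minimal sketch of the chain-rule factorization in the quoted text,
not the model the quoted author has in mind. It scores a sentence as a
sum of per-word log probabilities. The toy corpus, the names
`train_bigrams` and `log2_prob`, the add-one smoothing, and the bigram
shortcut (each word is conditioned only on the previous word, not on
the whole prefix) are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy corpus, purely illustrative (not taken from the thread).
CORPUS = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat saw the dog",
]


def train_bigrams(sentences):
    """Count how often each word follows each previous word."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()  # <s> marks the sentence start
        vocab.update(words[1:])
        for prev, cur in zip(words, words[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def log2_prob(sentence, model):
    """Chain rule: log2 P(x_1..n) = sum over i of log2 P(x_i | x_i-1).

    Assumption: P(x_i | x_1..i-1) is approximated by P(x_i | x_i-1).
    """
    context_counts, bigram_counts, vocab = model
    words = ["<s>"] + sentence.split()
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        # Add-one smoothing so unseen word pairs get a small nonzero probability.
        p = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + len(vocab))
        total += math.log2(p)
    return total


model = train_bigrams(CORPUS)
ordered = log2_prob("the cat sat on the mat", model)
scrambled = log2_prob("mat the on sat cat the", model)
```

Under this toy model the in-order sentence gets a higher log
probability than the same words scrambled, which is the narrow sense
of "harder to predict" used in the quoted text.
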
>
> There is one more point: prediction implies understanding. If I want
> to test if you understand something I say, I would ask you to repeat
> it back to me. Obviously you can't do that if you can't remember it.
> Your short term memory capacity is about 100 bits, or enough to repeat
> back a list of about 7 random words. But you can remember sentences
> longer than 7 words if they obey language rules because you don't have
> to remember the whole thing. You only have to remember the difference
> between what I said and what you predicted I would say. Typical
> written English has an entropy of about 1 bit per character relative
> to your language model. This allows you to repeat back well formed
> sentences up to about 20 words.
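
The arithmetic behind the figures in the quoted text can be checked
with a short sketch. The 100-bit capacity, the 7 random words, and the
1 bit per character come from the quoted text; the average of 5
characters per word is an assumption added here to reach the "about 20
words" figure, and the function names are hypothetical.

```python
STM_BITS = 100.0      # short-term memory capacity, from the quoted text
RANDOM_WORDS = 7.0    # random words that fit in that capacity, from the quoted text
BITS_PER_CHAR = 1.0   # entropy of written English, from the quoted text
CHARS_PER_WORD = 5.0  # ASSUMPTION: average word length, not stated in the thread


def bits_per_random_word(stm_bits=STM_BITS, n_words=RANDOM_WORDS):
    """Bits needed per word when the words cannot be predicted at all."""
    return stm_bits / n_words


def predictable_words(stm_bits=STM_BITS,
                      bits_per_char=BITS_PER_CHAR,
                      chars_per_word=CHARS_PER_WORD):
    """Words that fit in memory when only the unpredicted bits are stored."""
    chars = stm_bits / bits_per_char
    return chars / chars_per_word


if __name__ == "__main__":
    print("bits per random word:", bits_per_random_word())
    print("words of predictable text:", predictable_words())
```

With these inputs a random word costs a little over 14 bits, while
predictable text costs about 5 bits per word, so the same 100 bits
cover roughly 20 words instead of 7.
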
>
> --
> -- Matt Mahoney, [email protected]
>
>
> -------------------------------------------
> AGI
> Archives: https://www.listbox.com/member/archive/303/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/303/24379807-653794b5
> Modify Your Subscription: https://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com

