OK - I think their "y" is in fact the class label, drawn from {0,1,2}, associated with each X data point. Your data variable, classes, likewise takes its values in {0 1 2}, so you are selecting the "correct" points for training purposes. Sorry about that!
M
On 15/05/2019 22:41, 'Mike Day' via Programming wrote:
A couple of further thoughts: NB. I've clipped your message from the
foot of this one.
a) I think they're generating values from a Normal distribution, so perhaps
   rand1 =: 0.01 * rnorm   NB. would be better - your rand1 looks uniform...
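A minimal sketch, assuming the stats addon's rnorm (mean 0, sd 1 by default; the 2 100 shape is just for illustration):

   require 'stats/base/random'    NB. defines rnorm, assuming the stats addon is installed
   rand1 =: 0.01 * rnorm 2 100    NB. 2x100 matrix ~ N(0, 0.01^2), cf 0.01*np.random.randn(...) in the Python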
b) I don't understand the role of y in the Python lines like:
   correct_logprobs = -np.log(probs[range(num_examples),y])
y doesn't seem to be defined. Your "equivalent" is
   correct_logprobs =. -^.classes}|:prob
Does indexing by "classes" (all 3s) have the same effect as their y?
Enough for now!
Mike
On 15/05/2019 21:51, 'Mike Day' via Programming wrote:
I've had a look at your example and the source you cite. You differ from the source in seeming to need explicit handling of a hidden layer with both W & b AND W2 & b2, which I can't understand right now. Ah - I've just found a second listing, lower down the page, which does have W2 and b2 and a hidden layer!

I found, at least in Windows 10, that 'dot' plot ... shows more or less white space; 'symbol' plot is better.
Anyway, when I first ran train, I got:

   train 100
|nonce error: train
|   dhidden=.0 indx}dhidden

The trouble arose from this triplet of lines:

   dhidden =. dscores dot |:W2
   indx =. I. hidden_layer <: 0
   dhidden =. 0 indx}dhidden
Since you seem to be restricting dhidden to be non-negative, I replaced these three with:

   dhidden =. 0 >. dscores dot |:W2   NB. is this what you meant?
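Though rereading the Python - which presumably does dhidden[hidden_layer <= 0] = 0 - that zeroes the gradient where the forward ReLU was inactive, which isn't quite the same as clamping dhidden itself at zero. A sketch of a closer equivalent, assuming hidden_layer has the same 300 100 shape as dhidden:

   dhidden =. dscores dot |:W2
   dhidden =. dhidden * hidden_layer > 0   NB. kill the gradient wherever the ReLU output was zero

i.e. multiply by the boolean mask rather than taking 0 >. .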
I've also changed the loop so that we get a report for the first cycle, as in Python:

   for_i. i. >: y do.

and added this line after smoutput i,loss - it might not be necessary in Darwin...

   wd 'msgs'
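To see the reporting pattern in isolation (a toy stand-in for train's loop, with a dummy loss; wd assumes a JQt session):

demoloop =: 3 : 0
for_i. i. >: y do.
  loss =. ^. 3                NB. dummy value standing in for the real loss
  if. 0 = 1000 | i do.
    smoutput i , loss
    wd 'msgs'                 NB. flush pending output so the report appears mid-loop
  end.
end.
)

   demoloop 10000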
With these changes, train ran as follows:
cc =: train 10000   NB. loss starts ok, increases slightly, still unlike the Python example!
0 1.09856
1000 1.10522
2000 1.10218
3000 1.0997
4000 1.09887
5000 1.09867
6000 1.09862
7000 1.09861
8000 1.09861
9000 1.09861
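Note, by the way, that the plateau is exactly what a softmax stuck on uniform 1r3 1r3 1r3 predictions would give:

   ^. 3
1.09861

which squares with the chance-level accuracy below - the net doesn't appear to be learning at all.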
$h_l =. 0>.(>1{cc) +"1 X dot >0{cc   NB. hidden layer: relu of (X dot W) plus b
300 100
$sc =. (>3{cc) +"1 h_l dot >2{cc   NB. class scores: (h_l dot W2) plus b2
300 3
$predicted_class =. (i.>./)"1 sc   NB. index of each row's largest score
300
mean predicted_class = classes   NB. fraction classified correctly
0.333333
Why are the cycle 0 losses different, if only slightly? They report 1.098744, cf. your 1.09856.

Sorry - only minor problems found; they don't explain why you don't reproduce their results more closely,
Mike
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm