BINGO!
I'd misread that hidden_layer line. Try this rather inefficient
workaround, which does achieve what they intend:
dhidden =. (hidden_layer >0) * dscores dot |:W2
and then, Hey Presto!
cc =: train 10000
0 1.09876
1000 0.556147
2000 0.2535
3000 0.240415
4000 0.238063
5000 0.237042
6000 0.236117
7000 0.235589
8000 0.23537
9000 0.235135
10000 0.235025
$h_l =. 0>.(>1{cc) +"1 X dot >0{cc
300 100
$sc =. (>3{cc) +"1 h_l dot >2{cc
300 3
$predicted_class =. (i.>./)"1 sc
300
mean predicted_class = classes
0.996667
I'm using rand1 =: 0.01 * rnorm, but that might not matter much...
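(rnorm here is the stats addon's normal generator, I think; if it's not loaded, a quick Box-Muller stand-in would do - just a sketch, and bmnorm is only my own name for it:
bmnorm =: 3 : '(%: _2 * ^. ? y $ 0) * 2 o. (o. 2) * ? y $ 0'  NB. %:(_2 * ^. u1) * cos(2p1 * u2)
rand1 =: 0.01 * bmnorm
)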
And so to bed,
Mike
On 15/05/2019 23:22, Mike Day wrote:
Raul's point (just in - not copied here!) about capital Y is useful.
Pity they didn't use that trick!
NB - Don't forget that missing reg * W2*W2 in your loss calculation -
though it doesn't seem to change the results so far! It didn't appear in
their first set of code, where they didn't have W2 and b2.
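Roughly what I mean, as a sketch only - correct_logprobs is just my placeholder for the negated log-probabilities of the true classes, and reg is the regularisation strength:
data_loss =. mean correct_logprobs
reg_loss =. (-: reg) * (+/ , W * W) + +/ , W2 * W2
loss =. data_loss + reg_loss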
M
On 15/05/2019 23:11, Brian Schott wrote:
Mike,
I agree that 'symbol' works better than 'dot' in plot.
I can't think why you got the nonce error. But I was trying to reproduce
the following sequence from Python. This sequence surprised me a little,
as I had originally used something more like yours.
dhidden = np.dot(dscores, W2.T)
# backprop the ReLU non-linearity
dhidden[hidden_layer <= 0] = 0
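One amend-based rendering of those two lines in J might be to work over the ravel - just a sketch, I haven't run this exact form:
dhidden =. dscores dot |:W2
dhidden =. ($dhidden) $ 0 (I. , hidden_layer <: 0) } , dhidden  NB. zero where hidden_layer <: 0, via ravel indices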
I have read your follow-on messages and they reminded me that I renamed
their y as classes because J verbs don't like the name y for non-arguments.
And yes, classes are 0s, 1s, and 2s.
They are using normal variates with µ=0 and variance = 1, so I attempted
to supply "uniform" variates with similar mean and variance, but of course
not exactly the same relative frequencies. Uniform variates have variance
(b-a)^2/12, so (1 - _1)^2/12 = 4/12 = 1/3, so I multiplied each uniform
variate by %:3 to adjust. I hope that makes sense and hope it does not
produce such a great difference, but I suppose I can get normal variates
and experiment. Good idea.
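The scaling amounts to something like this - runit is just an illustrative name, not what's actually in my script:
runit =: 3 : '(%: 3) * <: +: ? y $ 0'   NB. uniform on (_1,1) has variance 1r3; %:3 scales it to 1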
Wrt your question "Does indexing by "classes" (all 3s) have the same
effect as their y?", I hope so.
Thanks very much,
On Wed, May 15, 2019 at 4:51 PM 'Mike Day' via Programming <
[email protected]> wrote:
I've had a look at your example and the source you cite. You differ from
the source in seeming to need explicit handling of the hidden layer with
both W & b AND W2 & b2, which I can't understand right now.
Ah - I've just found a second listing, lower down the page, which does
have W2 and b2 and a hidden layer!
I found, at least in Windows 10, that 'dot' plot ... shows more or less
white space; 'symbol' plot is better.
Anyway, when I first ran train, I got:
train 100
|nonce error: train
| dhidden=.0 indx}dhidden
The trouble arose from this triplet of lines:
dhidden =. dscores dot |:W2
indx =. I. hidden_layer <: 0
dhidden =. 0 indx}dhidden
Since you seem to be restricting dhidden to be non-negative, I replaced
these three with:
dhidden =. 0 >. dscores dot |:W2 NB. is this what you meant?
I've also changed the loop so that we get a report for the first cycle,
as in Python:
for_i. i. >: y do.
and added this line after smoutput i,loss - might not be necessary in Darwin...
wd'msgs'
With these changes, train ran as follows:
cc =: train 10000 NB. loss starts ok, increases slightly, still unlike the Python ex!
0 1.09856
1000 1.10522
2000 1.10218
3000 1.0997
4000 1.09887
5000 1.09867
6000 1.09862
7000 1.09861
8000 1.09861
9000 1.09861
$h_l =. 0>.(>1{cc) +"1 X dot >0{cc
300 100
$sc =. (>3{cc) +"1 h_l dot >2{cc
300 3
$predicted_class =. (i.>./)"1 sc
300
mean predicted_class = classes
0.333333
Why are the cycle 0 losses different, if only slightly? They report
1.098744 cf. your 1.09856.
Sorry - only minor problems found - they don't explain why you don't
reproduce their results more closely,
Mike
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm