I have been working on a convolutional neural net application and have
learned from Jon's conv2d system, especially about his verb `deconv` in the
backprop step. That was a real eye-opener for me. But when I use the
softmax activation, I am still having trouble getting well-trained nets --
even for NON-convolutional systems. So I have attempted to code a J version
of the toy, standard Python example at
http://cs231n.github.io/neural-networks-case-study/#net .

Below I have included my script and a sample run. I would appreciate
any advice.

The disappointment is that I get only about 51% prediction accuracy
(as documented at the bottom of this post), yet the authors get 98%.
I cannot see what is different between my code and the Python
code, but there must be some difference.
Notice that I have used Roll with a fixed seed (?.) in both verbs `rand1`
and `rand2` below, so you can likely reproduce exactly the result I get.
My plot of the data looks very similar to theirs.
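
In case it helps with reproducing: as I understand it, Roll with a fixed
seed restarts from the same seed on every call, so a phrase like the one
below should always match itself:

   (?. 6 # 0) -: ?. 6 # 0
1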


load'trig'
load'stats'
load'plot'
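NB. three spiral arms: row j of thetas runs from 4*j to 4+4*j in 100 steps;
NB. radii runs from 0.01 to 1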
thetas =: 0 4 8+/steps 0 4 99
radii =: 100%~steps 1 100 99
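NB. rand2, rand1: y reproducible deviates, uniform on (_1,1) scaled by
NB. %:3 (= 1.73205) for unit sd, standing in for numpy's randn;
NB. then *0.2 (angle noise) or *0.01 (weight scale)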
rand2 =: 1.73205*0.2 * _2 ([ * <:@+:@])/\ 0 2 ?.@$~ +:
rand1 =: 1.73205*0.01 * _2 ([ * <:@+:@])/\ 0 2 ?.@$~ +:
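NB. dot: matrix product;  probs: softmax of each rank-1 cell (rows sum to 1)
NB. amnd: amend that subtracts 1 at index x, e.g. 1 amnd 10 20 30 is 10 19 30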
dot =: +/ . *
probs =: (%+/)@:^"1
amnd =: (1-~{)`[`]}
mean =: +/ % #

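NB. X: 300 2 table of spiral points, class-major order, matching classes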
X =: |:,"2|:radii*|:(sin,:cos)(_100]\rand2 300)+thetas
classes =: 100#i. 3
NB. 'dot' plot ;/|:X

step_size =: 1e_0
reg =: 1e_3
N =: 100 NB. number of points per class
D =: 2 NB. dimensionality
K =: 3 NB. number of classes

train =: verb define
h =. 100 NB. size of hidden layer
W =. ($rand1@*/) D,h
b =. h#0
W2 =. ($rand1@*/) h,K
b2 =. K#0
(W;b;W2;b2) train y
:
'W b W2 b2' =. x
num_examples =. #X
for_i. 1+i. y do.
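    NB. forward pass: ReLU hidden layer, then class scores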
    hidden_layer =. 0>.b +"1 X dot W
    scores =. b2 +"1 hidden_layer dot W2
    NB. exp_scores =. ^scores

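    NB. softmax probabilities; classes}|:prob picks each example's
    NB. correct-class probability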
    prob =. probs scores
    correct_logprobs =. -^.classes}|:prob

    data_loss =. +/correct_logprobs%num_examples
    reg_loss =. 0.5*reg*+/,W*W
    loss =. data_loss + reg_loss
    if. 0=(y%10)|i do.
        smoutput i,loss
    end.
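    NB. gradient on scores: prob with 1 subtracted at each correct class,
    NB. averaged over examples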
    dscores =. prob
    dscores =. classes amnd"0 1 dscores
    dscores =. dscores%num_examples

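    NB. backprop into W2 and b2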
    dW2 =. (|:hidden_layer) dot dscores
    db2 =. +/dscores

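    NB. backprop into the hidden layer, then (as in the python:
    NB. dhidden[hidden_layer <= 0] = 0) zero it where the ReLU was inactive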
    dhidden =. dscores dot |:W2
    indx =. I. hidden_layer <: 0
    dhidden =. 0 indx}dhidden

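    NB. backprop into W and b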
    dW =. (|:X) dot dhidden
    db =. +/dhidden

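    NB. add the gradient of the regularization loss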
    dW2 =. dW2 + reg*W2
    dW =. dW + reg*W

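    NB. gradient descent parameter update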
    W =. W-step_size * dW
    b =. b-step_size * db

    W2 =. W2-step_size * dW2
    b2 =. b2-step_size * db2
end.
NB. (W;b);<W2;b2
W;b;W2;b2

)

Note 'sample results'
   cc =: train 10000
1000 8.40634
2000 4.71528
3000 5.8607
4000 11.6474
5000 10.9213
6000 4.65864
7000 2.1275
8000 1.44098
9000 1.28587
10000 1.25434
   $h_l =. 0>.(>1{cc) +"1 X dot >0{cc
300 100
   $sc =. (>3{cc) +"1 h_l dot >2{cc
300 3
   $predicted_class =. (i.>./)"1 sc
300
   mean predicted_class = classes
0.51
   JVERSION
Engine: j807/j64/darwin
Release-c: commercial/2019-02-24T10:50:40
Library: 8.07.25
Qt IDE: 1.7.9/5.9.7
Platform: Darwin 64
Installer: J807 install
InstallPath: /users/brian/j64-807
Contact: www.jsoftware.com
)


-- 
(B=)