[Jprogramming] Quadratic programming solvers in J
Has anybody written a quadratic optimization solver in J? Or is there one in any of the packages? Examples: https://en.m.wikipedia.org/wiki/Quadratic_programming -- For information about J forums see http://www.jsoftware.com/forums.htm
Re: [Jprogramming] A.
A. 0 1 5 is the same as A. 2 3 4 0 1 5, so the "missing" items seem to be implicitly placed before the specified items, in order.
Sent from Outlook Mobile
On Sat, Apr 30, 2016 at 8:32 AM -0700, "'Pascal Jasmin' via Programming" wrote:
   A. 1 3 5
36
   A. 5 1 3
40
   A. 1 5 3
37
   A. 5 3 1
41
do these results "mean" anything? I'm not sure that A. is defined for "open lists", though this does give an answer.
One meaning is 36 + A. (permutation of 0 1 2 in same sorted order as permutation of 1 3 5). Where does 36 come from?
   A. 0 1 5
300
where does 300 come from?
   A. 3 4 5
0
-- For information about J forums see http://www.jsoftware.com/forums.htm
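One way to check that reading is to supply the "missing" atoms explicitly and compare full permutations against the values quoted above (a quick session sketch):
   A. 2 3 4 0 1 5   NB. 0 1 5 with the missing 2 3 4 prefixed, in order
300
   A. 0 2 4 1 3 5   NB. 1 3 5 with the missing 0 2 4 prefixed, in order
36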
Re: [Jprogramming] Niven's constant
Hi,
I don't have my PC at hand now, but what is the problem with your calculation? I had never heard of Niven's constant, but looking at Wikipedia there is a formula for it containing the zeta function. Why not use this formula? Zeta 2 3 4 are well-known constants, and you can calculate or hardcode the values for other even arguments: https://en.m.wikipedia.org/wiki/Particular_values_of_Riemann_zeta_function You might need to calculate Bernoulli numbers. For odd values I think you need to calculate from scratch, but since zeta 14 is almost 1 you don't need to do too many calculations, just from 5 to 15, say. This is just thinking out loud. But my main point is you probably only need the first 15 or so zeta values before convergence is 'good enough'.
Regards, Jon
On Wed, Jul 6, 2016 at 12:12 AM +0900, "mikel paternain" wrote:
From: mikel paternain
Sent: Sunday, 3 July 2016 12:07:37
To: [email protected]
Subject: Niven's constant
Hi, I need to improve the code for calculating Niven's constant. Could you help me? A first test is (see "Computo ergo sum" section in JoJ):
exp=:+/"1@=@q:
max=:>./
a=.max@exp(2+i.1000)
C=.(+/%#)a
Thanks in advance, Mikel Paternain
-- For information about J forums see http://www.jsoftware.com/forums.htm
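For reference, the zeta formula for Niven's constant is C = 1 + sum over j >= 2 of (1 - 1 % zeta(j)). A rough sketch of that route, using a crude partial-sum zeta instead of hardcoded values (the verb name and term counts here are arbitrary choices):
   zeta =: 3 : '+/ % (1 + i. 1e6) ^ y'    NB. crude partial sum; adequate for j >= 2
   1 + +/ 1 - % zeta"0 (2 + i. 30)        NB. 1 + sum for j = 2..31 of 1 - 1/zeta(j)
1.70521
which agrees with the published value 1.705211... to about six figures, so thirty-odd terms are already "good enough" in the sense described above.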
Re: [Jprogramming] Greatest Increasing Subsequence
Yes, Raul is absolutely correct. And the flaw (in my solution, at least) is obvious now. I'll try for a correct solution again tomorrow.
From: 'Mike Day' via Programming
Sent: Saturday, September 3, 00:26
Subject: Re: [Jprogramming] Greatest Increasing Subsequence
To: 'Mike Day' via Programming
Same problem with my version, which was faster but equally wrong! Mike
On 02/09/2016 14:57, Raul Miller wrote:
> Actually, if we try your approach on -8 2 4 3 1 6 we get _8 _2 _1
> instead of _8 _4 _3 _1.
>
> Here's a brute force O(2^n) approach that I hacked up earlier - it
> obviously only works on short lists:
>
> increasing=: (-: /:~)@#~"1 #:@i.@^~&2@#
> longestinc=: ] #~ [: (#~ ([: (= >./) +/"1)) #:@I.@increasing
>
> We can do better, of course, but I have some other things I want to
> deal with, first.
>
> Thanks,
-- For information about J forums see http://www.jsoftware.com/forums.htm
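For the record, Raul's brute-force verbs do pick out the expected answer on that counterexample (a quick session sketch):
   increasing=: (-: /:~)@#~"1 #:@i.@^~&2@#
   longestinc=: ] #~ [: (#~ ([: (= >./) +/"1)) #:@I.@increasing
   longestinc _8 _2 _4 _3 _1 _6
_8 _4 _3 _1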
Re: [Jprogramming] simplifying im2col
I had a go writing conv nets in J.
See
https://github.com/jonghough/jlearn/blob/master/adv/conv2d.ijs
This uses ;._3 to do the convolutions. Using a version of this, with a couple
of fixes, I managed to get 88% accuracy on the CIFAR-10 image set. It took several
days to run, as my algorithms are not optimized in any way, and no GPU was
used.
If you look at the references in the above link, you may get some ideas.
The convolution verb is defined as:
cf=: 4 : 0
|:"2 |: +/ x filter&(convFunc"3 3);._3 y
)
Note that since the input is a batch of images, each 3-d (width, height,
channels), we are actually doing the whole forward pass over a 4d array, and
outputting another 4d array of different shape, depending on output channels,
filter width, and filter height.
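To get a concrete feel for those shapes, here is a small self-contained check (just a sketch: the filter contents, batch, and cut spec below are made-up values, and convFunc is the +/@:,@:* verb discussed later in the thread):
   convFunc =: +/@:,@:*
   filter   =: ? 4 3 2 2 $ 0      NB. 4 output channels, 3 input channels, 2x2 kernels
   cf       =: 4 : '|:"2 |: +/ x filter&(convFunc"3 3);._3 y'
   X        =: ? 10 3 8 8 $ 0     NB. batch of 10 images, each 3 channels of 8x8
   $ (1 3 3 ,: 3 2 2) cf"3 X      NB. full-depth 2x2 windows, spatial stride 3
10 4 3 3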
Thanks,
Jon
On Saturday, April 13, 2019, 7:25:16 AM GMT+9, Brian Schott
wrote:
As near as I can tell, im2col is a MATLAB utility used by convolutional
neural network developers to transform a 2D image into a column vector. It
relies on u;._3's subarray skills to "slide" through a 2D image.
I am experimenting with very small 3-by-4 pseudo-images and have created the
code below for my application. I found it very annoying to create my verbs
to work with both 12-item lists and with tables built on 12 columns. I'm
wondering if there is a simpler way to do what I show below. My verb im2col
assumes that a 2x2 "filter" is being slid around with a 1-unit "stride" and
no "zero padding", and at present I do not need more generality than that.
My sense is that I am unnecessarily using boxing, and I dislike the heavy use of the
rank conjunction. But I fear I am at the limit of my ability with this
complicated im2col and forcearray, so suggestions would be welcome.
(Btw, my righthand argument is just for development, not real.)
forcearray =: (($,)~_2{.1,$) NB. promote a 12-item list to a 1-row table; tables pass through unchanged
im2col =: (2 2$1 1 2 2) (|:"2@:(,;._3)"2) _4 ]\"1 ] NB. reshape each row to 3x4, ravel the stride-1 2x2 windows, transpose each plane
$,"1&>/"1<"2 im2col forcearray i.12
1 4 6
$,"1&>/"1<"2 im2col forcearray i.2 12
2 4 6
$,"1&>/"1<"2 im2col forcearray i.3 12 NB. etc
3 4 6
,"1&>/"1<"2 im2col forcearray i.2 12
0 1 2 4 5 6
1 2 3 5 6 7
4 5 6 8 9 10
5 6 7 9 10 11
12 13 14 16 17 18
13 14 15 17 18 19
16 17 18 20 21 22
17 18 19 21 22 23
,"1&>/"1<"2 im2col forcearray i. 12
0 1 2 4 5 6
1 2 3 5 6 7
4 5 6 8 9 10
5 6 7 9 10 11
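For comparison, one way to get the same 4-by-6 (and n-by-4-by-6) results without the boxing is to ravel the windows and transpose afterwards; a sketch, valid only for this fixed 2x2 filter, stride 1, no-padding case (im2col2 is a made-up name):
   im2col2 =: ([: |: [: ,/ (1 1 ,: 2 2) (,;._3) ])"2
   im2col2 _4 ]\"1 i. 12
0 1 2 4  5  6
1 2 3 5  6  7
4 5 6 8  9 10
5 6 7 9 10 11
   $ im2col2 _4 ]\"1 i. 2 12
2 4 6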
im2col is described in the link.
http://cs231n.github.io/convolutional-networks/
Thanks,
--
(B=)
--
For information about J forums see http://www.jsoftware.com/forums.htm
Re: [Jprogramming] convolutional neural network [was simplifying im2col]
n the long run to make the models more modular, so created he classes in adv/ folder, very loosely based on tensorflow and pytorch, which allows adding layers at will. I will need to relearn a lot about conv nets to figure out why I used 1 0 2 3 |:, because, to be honest, I can't remember. I do remember that doing back prop over these 4-D matrices is very tricky, and I remember drawing lots of pics to figure it out, and that is what I came up with. It may be worth trying your way, to see what happens. As you rightly mentioned, there are more efficient ways of calculating the convolutions, but as I said, I have not yet gone back to think about that. I should probably try to implement it as tensorflow or pytorch do, but I was more interested in understand the theory of how it works than making it run fast. Perhaps the next stage will be to optimize this, if I have time. One more thing about my Conv2D implementation. The stride of the filter kernel cannot be any old value. The convolution algorithm needs to be EXACT (see: https://arxiv.org/pdf/1603.07285.pdf (section 2.3)) If I rememeber correctly this means that the input size subtract kernel size must be a multiple of the stride. ie input size 7 kernel size 3 stride 2 7 -3 divides 2, so ok. This limits what possible values can go into each conv layer. Tensor flow allows non-EXACT convolutions, and it probablh wouldn't be too much bother to implement that (the back prop would need to be refactored mainly), but not really on my todo list. Regards, Jon On Thursday, April 18, 2019, 11:36:35 AM GMT+9, Brian Schott wrote: I have renamed this message because the topic has changed, but considered moving it to jchat as well. However I settled on jprogramming because there are definitely some j programming issues to discuss. Jon, Your script code is beautifully commented and very valuable, imho. The lack of an example has slowed down my study of the script, but now I have some questions and comments. I gather from your comments that the word tensor is used to designate a 4 dimensional array. That's new to me, but it is very logical. Your definition convFunc=: +/@:,@:* works very well. However, for some reason I wish I could think of a way to defined convFunc in terms of X=: dot=: +/ . * . The main insight I have gained from your code is that (x u;.+_3 y) can be used with x of shape 2 n where n>2 (and not just 2 2). This is great information. And that you built the convFunc directly into cf is also very enlightening. I have created a couple of examples of the use of your function `cf` to better understand how it works. [The data is borrowed from the fine example at http://cs231n.github.io/convolutional-networks/#conv . Beware that the dynamic example seen at the link changes everytime the page is refreshed, so you will not see the exact data I present, but the shapes of the data are constant.] Notice that in my first experiments both `filter` and the RHA of cf"3 are arrays and not tensors. Consequently(?) the result is an array, not a tensor, either. i=: _7]\".;._2 (0 : 0) 0 0 0 0 0 0 0 0 0 0 1 2 2 0 0 0 0 2 1 0 0 0 0 0 1 2 2 0 0 0 0 0 2 0 0 0 0 0 2 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 1 2 2 2 0 0 0 1 0 2 0 0 0 1 1 1 1 1 0 0 2 0 0 0 2 0 0 0 0 2 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 1 1 0 0 0 0 0 2 1 2 0 2 0 0 1 0 0 2 2 0 0 1 0 1 2 2 0 0 0 0 0 0 0 0 ) k =: _3]\".;._2(0 :0) 1 0 0 1 _1 0 _1 _1 1 0 _1 1 0 0 1 0 _1 1 1 0 1 0 _1 0 0 _1 0 ) $i NB. 3 7 7 $k NB. 
3 3 3 filter =: k convFunc=: +/@:,@:* cf=: 4 : '|:"2 |: +/ x filter&(convFunc"3 3);._3 y' (1 2 2,:3 3 3) cf"3 i NB. 3 3$1 1 _2 _2 3 _7 _3 1 0 My next example makes both the `filter` and the RHA into tensors. And notice the shape of the result shows it is a tensor, also. filter2 =: filter,:_1+filter cf2=: 4 : '|:"2 |: +/ x filter2&(convFunc"3 3);._3 y' $ (1 2 2,:3 3 3) cf2"3 i,:5+i NB. 2 2 3 3 Much of my effort regarding CNN has been studying the literature that discusses efficient ways of computing these convolutions by translating the filters and the image data into flattened (and somewhat sparse} forms that can be restated in matrix formats. These matrices accomplish the convolution and deconvolution as *efficient* matrix products. Your demonstration of the way that j's ;._3 can be so effective challenges the need for such efficiencies. On the other hand, I could use some help understanding how the 1 0 2 3 |: transpose you apply to `filter` is effective in the backpropogation stage. Part of my confusion is that I would have thought the transpose would have been 0 1 3 2 |:, instead. Can you say more about that? I have yet to try to understand your verbs `forward` and `backward`, but I look forward to doing so. I could not find definitions for the following functions a
Re: [Jprogramming] convolutional neural network [was simplifying im2col]
onsequently(?) the result is an array, not a tensor, either. i=: _7]\".;._2 (0 : 0) 0 0 0 0 0 0 0 0 0 0 1 2 2 0 0 0 0 2 1 0 0 0 0 0 1 2 2 0 0 0 0 0 2 0 0 0 0 0 2 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 1 2 2 2 0 0 0 1 0 2 0 0 0 1 1 1 1 1 0 0 2 0 0 0 2 0 0 0 0 2 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 1 1 0 0 0 0 0 2 1 2 0 2 0 0 1 0 0 2 2 0 0 1 0 1 2 2 0 0 0 0 0 0 0 0 ) k =: _3]\".;._2(0 :0) 1 0 0 1 _1 0 _1 _1 1 0 _1 1 0 0 1 0 _1 1 1 0 1 0 _1 0 0 _1 0 ) $i NB. 3 7 7 $k NB. 3 3 3 filter =: k convFunc=: +/@:,@:* cf=: 4 : '|:"2 |: +/ x filter&(convFunc"3 3);._3 y' (1 2 2,:3 3 3) cf"3 i NB. 3 3$1 1 _2 _2 3 _7 _3 1 0 My next example makes both the `filter` and the RHA into tensors. And notice the shape of the result shows it is a tensor, also. filter2 =: filter,:_1+filter cf2=: 4 : '|:"2 |: +/ x filter2&(convFunc"3 3);._3 y' $ (1 2 2,:3 3 3) cf2"3 i,:5+i NB. 2 2 3 3 Much of my effort regarding CNN has been studying the literature that discusses efficient ways of computing these convolutions by translating the filters and the image data into flattened (and somewhat sparse} forms that can be restated in matrix formats. These matrices accomplish the convolution and deconvolution as *efficient* matrix products. Your demonstration of the way that j's ;._3 can be so effective challenges the need for such efficiencies. On the other hand, I could use some help understanding how the 1 0 2 3 |: transpose you apply to `filter` is effective in the backpropogation stage. Part of my confusion is that I would have thought the transpose would have been 0 1 3 2 |:, instead. Can you say more about that? I have yet to try to understand your verbs `forward` and `backward`, but I look forward to doing so. I could not find definitions for the following functions and wonder if you can say more about them, please? bmt_jLearnUtil_ setSolver I noticed that your definitions of relu and derivRelu were more complicated than mine, so I attempted to test yours out against mine as follows. relu =: 0&>. derivRelu =: 0&< (relu -: 0:`[@.>&0) i: 4 1 (derivRelu -: 0:`1:@.>&0) i: 4 1 On Sun, Apr 14, 2019 at 8:31 AM jonghough via Programming < [email protected]> wrote: > I had a go writing conv nets in J. > See > https://github.com/jonghough/jlearn/blob/master/adv/conv2d.ijs > > This uses ;.3 to do the convolutions. Using a version of this , with a > couple of fixes/, I managed to get 88% accuracy on the cifar-10 imageset. > Took several days to run, as my algorithms are not optimized in any way, > and no gpu was used. > If you look at the references in the above link, you may get some ideas. > > the convolution verb is defined as: > cf=: 4 : 0 > |:"2 |: +/ x filter&(convFunc"3 3);._3 y > ) > > Note that since the input is an batch of images, each 3-d (width, height, > channels), we are actually doing the whole forward pass over a 4d array, > and outputting another 4d array of different shape, depending on output > channels, filter width, and filter height. > > Thanks, > Jon > Thank you, -- (B=) -- For information about J forums see http://www.jsoftware.com/forums.htm -- For information about J forums see http://www.jsoftware.com/forums.htm
Re: [Jprogramming] convolutional neural network [was simplifying im2col]
> > pipe=: (10;10;'softmax';1;'l2';0.0001) conew 'NNPipeline' > |assertion failure: create__w > | 4 =#shape I think this assertion error is in 'Conv2D' (conv2d.ijs). Not 'NNPipeline'. The shape (first constructor argument needs 4 numbers) e.g. c1=: ((10 3 4 4);2;'relu';'adam';0.01;0) conew 'Conv2D' the shape is 10 3 4 4. If you copy pasted my example I can't see how you got an assertion error there. Did you clone the whole repo? Regards, Jon On Friday, April 19, 2019, 2:33:04 AM GMT+9, Brian Schott wrote: Jon, I really appreciate your giving new examples. Before your reply I did not realize https://github.com/jonghough/jlearn/blob/master/adv/conv2d.ijs is part of the grander (duh!) https://github.com/jonghough/jlearn/ But now I am having trouble with your create verb and setting up so that it is executed in the right locale environment. When I thought I had that environment worked out, then I get assertion errors from the create such as the following: pipe=: (10;10;'softmax';1;'l2';0.0001) conew 'NNPipeline' |assertion failure: create__w | 4 =#shape I am pretty sure you are using different `create`s and are using them in unstated `cocurrent` environments. Would you mind providing the j environment at the start of this example? This most recent example with 5 3 8 8 shaped tensors is likely to be exactly what I am looking for if I can get it working. Thanks, much, On Thu, Apr 18, 2019 at 10:38 AM jonghough via Programming < [email protected]> wrote: > > Regarding the test network I sent in the previous email, it will not work. > This one should: > > [snip] -- (B=) -- For information about J forums see http://www.jsoftware.com/forums.htm -- For information about J forums see http://www.jsoftware.com/forums.htm
Re: [Jprogramming] convolutional neural network [was simplifying im2col]
The convolution kernel function is just a straight up elementwise multiply and then sum all, it is not a dot product or matrix product. Nice illustration is found here: https://mlnotebook.github.io/post/CNN1/ so +/@:,@:* works. I don't know if there is a faster way to do it. Thanks, Jon On Friday, April 19, 2019, 5:54:24 AM GMT+9, Raul Miller wrote: They're also not equivalent. For example: (i.2 3 4) +/@:,@:* i.2 3 970 (i.2 3 4) +/ .* i.2 3 |length error I haven't studied the possibilities of this code base enough to know how relevant this might be, but if you're working with rank 4 arrays, this kind of thing might matter. On the other hand, if the arrays handled by +/@:,@:* are the same shape, then +/ .*&, might be what you want. (Then again... any change introduced on "performance" grounds should get at least enough testing to show that there's a current machine where that change provides significant benefit for plausible data.) Thanks, -- Raul On Thu, Apr 18, 2019 at 4:16 PM Henry Rich wrote: > > FYI: +/@:*"1 and +/ . * are two ways of doing dot-products fast. > +/@:,@:* is not as fast. > > Henry Rich > > On 4/18/2019 10:38 AM, jonghough via Programming wrote: > > > > Regarding the test network I sent in the previous email, it will not work. > > This one should: > > > > NB. > > = > > > > > > NB. 3 classes > > NB. horizontal lines (A), vertical lines (B), diagonal lines (C). > > NB. each class is a 3 channel matrix 3x8x8 > > > > > > A1=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 > > 1 1, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 > > 0 0 0 0 0 0 0 > > > > A2=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 > > 1 1, 0 0 0 0 0 0 0 0 > > > > A3=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0 > > > > A4=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 > > 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0 > > > > A5=: 2 |. 
A4 > > > > > > > > B1=: |:"2 A1 > > > > B2=: |:"2 A2 > > > > B3=: |:"2 A3 > > > > B4=: |:"2 A4 > > > > B5=: |:"2 A5 > > > > > > > > C1=: 3 8 8 $ 1 0 0 0 0 0 0 1, 0 1 0 0 0 0 1 0, 0 0 1 0 0 1 0 0, 0 0 0 1 1 0 > > 0 0, 0 0 0 1 1 0 0 0, 0 0 1 0 0 1 0 0, 0 1 0 0 0 0 1 0, 1 0 0 0 0 0 0 1 > > > > C2=: 3 8 8 $ 1 0 0 0 0 0 0 0, 0 1 0 0 0 0 0 0, 0 0 1 0 0 0 0 0, 0 0 0 1 0 0 > > 0 0, 0 0 0 0 1 0 0 0, 0 0 0 0 0 1 0 0, 0 0 0 0 0 0 1 0, 0 0 0 0 0 0 0 1 > > > > C3=: 3 8 8 $ 1 0 1 0 1 0 0 0, 0 1 0 1 0 1 0 0, 0 0 1 0 1 0 1 0, 0 0 0 1 0 1 > > 0 1, 1 0 0 0 1 0 1 0, 0 1 0 0 0 1 0 1, 1 0 1 0 0 0 1 0, 0 1 0 1 0 0 0 1 > > > > C4=: |."1 C3 > > > > C5=: 3 8 8 $ 1 1 1 1 0 0 0 0, 0 0 1 1 1 1 0 0, 0 0 0 0 1 1 1 1, 1 1 0 0 0 0 > > 1 1, 1 1 1 1 0 0 0 0, 0 0 1 1 1 1 0 0, 0 0 0 0 1 1 1 1, 1 1 0 0 0 0 1 1 > > > > > > > > A=: 5 3 8 8 $, A1, A2, A3, A4, A5 > > > > B=: 5 3 8 8 $, B1, B2, B3, B4, B5 > > > > C=: 5 3 8 8 $, C1, C2, C3, C4, C5 > > > > INPUT=: A,B,C > > > > OUTPUT=: 15 3 $ 1 0 0, 1 0 0, 1 0 0, 1 0 0, 1 0 0, 0 1 0, 0 1 0, 0 1 0, 0 1 > > 0, 0 1 0, 0 0 1, 0 0 1, 0 0 1, 0 0 1, 0 0 1 > > > > > > > > pipe=: (10;10;'softmax';1;'l2';0.0001) conew 'NNPipeline' > > > > c1=: ((10 3 4 4);2;'relu';'adam';0.01;0) conew 'Conv2D' > > > > b1=: (0; 1 ;0.0001;10;0.01) conew 'BatchNorm2D' > > > > a1=: 'relu' conew 'Activation' > > > > > > > > c2=: ((12 10 2 2); 1;'relu';'adam';0.01;0) conew 'Conv2D' > > > > b2=: (0; 1 ;0.0001;5;0.01) conew 'BatchNorm2D' > > > > a2=: 'relu' conew 'Activation' > > > > p1=: 2 conew 'PoolLayer' > > > > > > > > fl=: 3 conew 'FlattenLayer' > > > > fc=: (12;3;'softmax';'adam';0.01) conew 'SimpleLayer' > > > > b3=: (0; 1 ;0.0001;2;0.01) conew 'BatchNorm' > > > > a3=: 'softmax' conew 'Activation' > > > > > > > > addLayer__pipe c1 > > > > addLayer__pipe p1 > > > > NB.addLayer__pipe b1 > > > > addLayer__pipe a1 > > > > addLayer__pipe c2 > > > > NB.addLayer__pipe b2 > > > > addLayer__pipe a2 > > > > ad
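To make the same-shape point above concrete (using integer data so the comparison is exact), summing the elementwise products agrees with a dot product over the ravels:
   a =: i. 3 4 4
   b =: >: i. 3 4 4
   a +/@:,@:* b
36848
   a (+/ . *)&, b
36848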
Re: [Jprogramming] convolutional neural network [was simplifying im2col]
Sorry, as I said in a previous email, the example I gave with runConv will not
work, as it was made for a much older version of the project. Please try this
as is, in jqt.
NB. ==
A1=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0
A2=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0
A3=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0
A4=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0
A5=: 2 |. A4
B1=: |:"2 A1
B2=: |:"2 A2
B3=: |:"2 A3
B4=: |:"2 A4
B5=: |:"2 A5
C1=: 3 8 8 $ 1 0 0 0 0 0 0 1, 0 1 0 0 0 0 1 0, 0 0 1 0 0 1 0 0, 0 0 0 1 1 0 0 0, 0 0 0 1 1 0 0 0, 0 0 1 0 0 1 0 0, 0 1 0 0 0 0 1 0, 1 0 0 0 0 0 0 1
C2=: 3 8 8 $ 1 0 0 0 0 0 0 0, 0 1 0 0 0 0 0 0, 0 0 1 0 0 0 0 0, 0 0 0 1 0 0 0 0, 0 0 0 0 1 0 0 0, 0 0 0 0 0 1 0 0, 0 0 0 0 0 0 1 0, 0 0 0 0 0 0 0 1
C3=: 3 8 8 $ 1 0 1 0 1 0 0 0, 0 1 0 1 0 1 0 0, 0 0 1 0 1 0 1 0, 0 0 0 1 0 1 0 1, 1 0 0 0 1 0 1 0, 0 1 0 0 0 1 0 1, 1 0 1 0 0 0 1 0, 0 1 0 1 0 0 0 1
C4=: |."1 C3
C5=: 3 8 8 $ 1 1 1 1 0 0 0 0, 0 0 1 1 1 1 0 0, 0 0 0 0 1 1 1 1, 1 1 0 0 0 0 1 1, 1 1 1 1 0 0 0 0, 0 0 1 1 1 1 0 0, 0 0 0 0 1 1 1 1, 1 1 0 0 0 0 1 1
A=: 5 3 8 8 $, A1, A2, A3, A4, A5
B=: 5 3 8 8 $, B1, B2, B3, B4, B5
C=: 5 3 8 8 $, C1, C2, C3, C4, C5
INPUT=: A,B,C
OUTPUT=: 15 3 $ 1 0 0, 1 0 0, 1 0 0, 1 0 0, 1 0 0, 0 1 0, 0 1 0, 0 1 0, 0 1 0, 0 1 0, 0 0 1, 0 0 1, 0 0 1, 0 0 1, 0 0 1
pipe=: (10;10;'softmax';1;'l2';0.0001) conew 'NNPipeline'
c1=: ((10 3 4 4);2;'relu';'adam';0.01;0) conew 'Conv2D'
b1=: (0; 1 ;0.0001;10;0.01) conew 'BatchNorm2D'
a1=: 'relu' conew 'Activation'
c2=: ((12 10 2 2); 1;'relu';'adam';0.01;0) conew 'Conv2D'
b2=: (0; 1 ;0.0001;5;0.01) conew 'BatchNorm2D'
a2=: 'relu' conew 'Activation'
p1=: 2 conew 'PoolLayer'
fl=: 3 conew 'FlattenLayer'
fc=: (12;3;'softmax';'adam';0.01) conew 'SimpleLayer'
b3=: (0; 1 ;0.0001;2;0.01) conew 'BatchNorm'
a3=: 'softmax' conew 'Activation'
addLayer__pipe c1
addLayer__pipe p1
NB.addLayer__pipe b1
addLayer__pipe a1
addLayer__pipe c2
NB.addLayer__pipe b2
addLayer__pipe a2
addLayer__pipe fl
addLayer__pipe fc
NB.addLayer__pipe b3
addLayer__pipe a3
NB. OUTPUT fit__pipe INPUT NB. <- run this. maybe 3 or 4 times
NB. OUTPUT -:"1 1 (=>./)"1 >{: predict__pipe INPUT NB. <- then run this to check accuracy
NB. ==
I did all my work using jQT, not jconsole.
My recommendation is:
1. open jqt and locate the project (hopefully you put it in projects folder)
2. load init.ijs (which will load everything)
3. put the above code into a J temp file.
4. run the temp file (and remember to run this line: OUTPUT fit__pipe INPUT )
On Friday, April 19, 2019, 1:05:52 PM GMT+9, Brian Schott
wrote:
I have now cloned the repo and below I show my directory.
server:~ brian$ cd /Users/brian/j64-807-user/nn
server:nn brian$ pwd
/Users/brian/j64-807-user/nn
server:nn brian$ ls /Users/brian/j64-807-user/nn/jlearn-master
LICENSE init.ijs optimize
README.md iris_som_all.png plot
adv iris_som_clusters.png rbf
clustering iris_som_planes.png run.ijs
datasets iris_som_umatrix.png score
densityestimation jlearn.gif serialize
dr jlearn.jproj solverexamples
energy kdtree som
ensemble knn test
genetic linear text
gp mixtures trees
impute mlp utils
server:nn brian$
Below is what I copied into jconsole.
cocurrent 'NN'
coclass 'Conv2D'
coinsert 'NNLayer'
cocurrent'NNPipeline'
create=: 3 : 0
if. a: -: y do.
bias=: ''
''
else.
'shape stride activation solverType alpha clampFlg'=: y
assert. 4 = # shape NB. wxhxdinxdout
assert. 1 <: stride
assert. alpha > 0
ks=: 2 3 $ (3 # stride) ,}.shape NB. kernel shape
filter=: activation createRandomWeightsNormal shape
reordered=: 1 0 2 3 |: filter
setActivationFunctions activation
solver=: ( wrote:
> >
> > pipe=: (10;10;'softmax';1;'l2';0.0001) conew 'NNPipeline'
> > |assertion failure: create__w
> > | 4 =#shape
> I think this assertion error is in 'Conv2D' (conv2d.ijs). Not
> 'NNPipeline'. The shape (first constructor argument needs 4 numbers)
> e.g.
> c1=: ((10 3 4 4);2;'relu';'adam';0.01;0) conew 'Conv2D'
>
> the shape is 10 3 4 4.
>
> If you copy pasted my example I can't see how you got an assertion error
> there. Did you clone the whole repo?
>
> Regards,
>
>
--
For information about J forums see http://www.jsoftware.com/forums.htm
Re: [Jprogramming] convolutional neural network [was simplifying im2col]
Even though I have used J on and off for a few years now, I have not really
used jConsole at all. But I did just try it:
**
shell-prompt> ./jconsole
load jpath '~Projects/jlearn/init.ijs'
...
load jpath '~temp/simple_conv_test.ijs'
OUTPUT fit__pipe INPUT
**
The above should work.
My project is in Projects/jlearn
My test script is in temp/simple_conv_test.ijs
You may have your repository clone in some other location, so you will have to
find it when you load init.ijs.
On Friday, April 19, 2019, 1:25:03 PM GMT+9, jonghough via Programming
wrote:
Sorry, as I said in a previous email, the example I gave with runConv will
not work, as it was made for a much older version of the project. Please try
this as is, in jqt.
Re: [Jprogramming] convolutional neural network [was simplifying im2col]
Please create `simple_conv_test.ijs` in your temp folder (or somehwhere else) and put the script that I sent today into it. I just tried it and after doing that there will be another issue. As I said, this was written in, and for, jqt, so I used wd to allow msg output while the operation is still running. In jconsole wd will not be found, so after loading your `simple_conv_test.ijs` please do this: wd_NN_=:] That is a simple hack to get around the fact that wd will not be found otherwise. Then run your conv net OUTPUT fit__pipe INPUT On Friday, April 19, 2019, 1:53:41 PM GMT+9, Brian Schott wrote: We're getting warmer in jconsole. But I cannot find `simple_conv_test.ijs` as you can see at the bottom. Btw, I never have used jQt for projects successfully, so I could not figure out how to put the clone into the projects there. load jpath '~Projects/jlearn/init.ijs' not found: /users/brian/j64-807-user/projects/jlearn/init.ijs |file name error: script | 0!:0 y[4!:55<'y' load'/Users/brian/j64-807-user/projects/jlearn/init.ijs' not found: /users/brian/j64-807/addons/tables/csv/csv.ijs |file name error: script | 0!:0 y[4!:55<'y' load'/Users/brian/j64-807-user/projects/jlearn/init.ijs' 1 Test success Simple GMM test, diagonal covariance ... load jpath '~temp/simple_conv_test.ijs' not found: /users/brian/j64-807-user/temp/simple_conv_test.ijs On Fri, Apr 19, 2019 at 12:35 AM jonghough via Programming < [email protected]> wrote: > Even though I have used J on and off for a few eyars now, I have not > really used jConsole at all. But I did just try it: > ** > shell-promtp> ./jconsole > > load jpath '~Projects/jlearn/init.ijs' > ... > load jpath '~temp/simple_conv_test.ijs' > > OUTPUT fit__pipe INPUT > > ** > > The above should work. > My project is in Projects/jlearn > My test script it in temp/simple_conv_test.ijs > > You may have you repository clone in some other location, so you will have > to find it when you load init.ijs. > On Friday, April 19, 2019, 1:25:03 PM GMT+9, jonghough via > Programming wrote: > > Sorry, as I said in a previous email, the example I gave with runConv > will not work, as it was made for a much older version of the project. > Please try this as is, in jqt. > > NB. == > A1=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0, 1 1 1 1 1 > 1 1 1, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, > 0 0 0 0 0 0 0 0 > A2=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 > 1 1 1, 0 0 0 0 0 0 0 0 > A3=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0 > A4=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0, 1 1 1 1 1 > 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0 > A5=: 2 |. 
A4 > > B1=: |:"2 A1 > B2=: |:"2 A2 > B3=: |:"2 A3 > B4=: |:"2 A4 > B5=: |:"2 A5 > > C1=: 3 8 8 $ 1 0 0 0 0 0 0 1, 0 1 0 0 0 0 1 0, 0 0 1 0 0 1 0 0, 0 0 0 1 1 > 0 0 0, 0 0 0 1 1 0 0 0, 0 0 1 0 0 1 0 0, 0 1 0 0 0 0 1 0, 1 0 0 0 0 0 0 1 > C2=: 3 8 8 $ 1 0 0 0 0 0 0 0, 0 1 0 0 0 0 0 0, 0 0 1 0 0 0 0 0, 0 0 0 1 0 > 0 0 0, 0 0 0 0 1 0 0 0, 0 0 0 0 0 1 0 0, 0 0 0 0 0 0 1 0, 0 0 0 0 0 0 0 1 > C3=: 3 8 8 $ 1 0 1 0 1 0 0 0, 0 1 0 1 0 1 0 0, 0 0 1 0 1 0 1 0, 0 0 0 1 0 > 1 0 1, 1 0 0 0 1 0 1 0, 0 1 0 0 0 1 0 1, 1 0 1 0 0 0 1 0, 0 1 0 1 0 0 0 1 > C4=: |."1 C3 > C5=: 3 8 8 $ 1 1 1 1 0 0 0 0, 0 0 1 1 1 1 0 0, 0 0 0 0 1 1 1 1, 1 1 0 0 0 > 0 1 1, 1 1 1 1 0 0 0 0, 0 0 1 1 1 1 0 0, 0 0 0 0 1 1 1 1, 1 1 0 0 0 0 1 1 > > A=: 5 3 8 8 $, A1, A2, A3, A4, A5 > B=: 5 3 8 8 $, B1, B2, B3, B4, B5 > C=: 5 3 8 8 $, C1, C2, C3, C4, C5 > INPUT=: A,B,C > OUTPUT=: 15 3 $ 1 0 0, 1 0 0, 1 0 0, 1 0 0, 1 0 0, 0 1 0, 0 1 0, 0 1 0, 0 > 1 0, 0 1 0, 0 0 1, 0 0 1, 0 0 1, 0 0 1, 0 0 1 > > pipe=: (10;10;'softmax';1;'l2';0.0001) conew 'NNPipeline' > c1=: ((10 3 4 4);2;'relu';'adam';0.01;0) conew 'Conv2D' > b1=: (0; 1 ;0.0001;10;0.01) conew 'BatchNorm2D' > a1=: 'relu' conew 'Activation' > > c2=: ((12 10 2 2); 1;'relu';'adam';0.01;0) conew 'Conv2D' > b2=: (0; 1 ;0.0001;5;0.01) conew 'BatchNorm2D' > a2=: 'relu' conew 'Activation' > p1=: 2 conew 'PoolLayer' > > fl=: 3 conew 'FlattenLayer' > fc=: (12;3;'softmax';'adam';0.01) conew 'SimpleLayer' > b3=: (0; 1 ;0.0001;2;0.01) conew 'BatchNorm' > a3=: 'softmax' conew 'Activation' > > addLayer__pipe c1 > addLayer__pip
Re: [Jprogramming] convolutional neural network [was simplifying im2col]
> so I should run `OUTPUT fit__pipe INPUT` 2 or 3 more times.
Yes, I think so. After two or three more times, you should get all correct: 100%
accuracy.
>What does the other output mean? For example what is alternating 1 and 2,
> what is 1...20, what is 10?
There are 15 images. When we constructed the 'NNPipeline' class, we chose a
batch size of 10,
and an epoch count of 10 (epoch is one complete iteration through the whole
dataset).
So 15 / 10 = 1.5, which we round up to 2. So one epoch needs 2 iterations to
complete (with some images getting an extra pass).
As we selected 10 epochs, the total number of iterations is 2 * 10 = 20. 1 and
2 are the iterations for the current epoch.
For these small cases, this information is hardly useful. But, for example,
cifar-10 has 50,000 images, and batch sizes of 100 (in the model I made),
so one epoch is 500 iterations. It helps to know how many iterations have been
completed so far.
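The arithmetic, as a quick check in J:
   >. 15 % 10        NB. iterations per epoch = ceiling of (images % batch size)
2
   10 * >. 15 % 10   NB. total iterations = epochs * iterations per epoch
20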
It might help to play around with the parameters of the 'NNPipeline' constructor.
The learning rate 0.001 could be increased, for instance. I don't know whether
that would improve anything, but it could be interesting.
Also, assuming you can use plot.
plot _3 (+/ % #) \ lossKeep__pipe
will give you a rolling average (of 3) of the loss during training. This is not
accuracy, but cross entropy loss. It should be a downward curve.
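For anyone unfamiliar with the phrase: a negative left argument to \ gives non-overlapping infixes, so _3 (+/ % #)\ averages each successive group of three. A tiny example:
   _3 (+/ % #)\ 6 5 4 3 2 1
5 2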
On Friday, April 19, 2019, 2:18:52 PM GMT+9, Brian Schott
wrote:
Ok. This is what I got.
load jpath '~temp/simple_conv_test.ijs'
batchnorm
┌─┬─┬──────┬──┬────┐
│0│1│0.0001│10│0.01│
└─┴─┴──────┴──┴────┘
batchnorm
┌─┬─┬──────┬─┬────┐
│0│1│0.0001│5│0.01│
└─┴─┴──────┴─┴────┘
batchnorm
┌─┬─┬──────┬─┬────┐
│0│1│0.0001│2│0.01│
└─┴─┴──────┴─┴────┘
Added Conv2D. Network depth is 1.
Added PoolLayer. Network depth is 2.
Added Activation. Network depth is 3.
Added Conv2D. Network depth is 4.
Added Activation. Network depth is 5.
Added FlattenLayer. Network depth is 6.
Added SimpleLayer. Network depth is 7.
Added Activation. Network depth is 8.
OUTPUT fit__pipe INPUT
Iteration complete: 1, total: 1
Iteration complete: 2, total: 2
Iteration complete: 1, total: 3
Iteration complete: 2, total: 4
Iteration complete: 1, total: 5
Iteration complete: 2, total: 6
Iteration complete: 1, total: 7
Iteration complete: 2, total: 8
Iteration complete: 1, total: 9
Iteration complete: 2, total: 10
Iteration complete: 1, total: 11
Iteration complete: 2, total: 12
Iteration complete: 1, total: 13
Iteration complete: 2, total: 14
Iteration complete: 1, total: 15
Iteration complete: 2, total: 16
Iteration complete: 1, total: 17
Iteration complete: 2, total: 18
Iteration complete: 1, total: 19
Iteration complete: 2, total: 20
10
OUTPUT -:"1 1 (=>./)"1 >{: predict__pipe INPUT
1 0 1 1 1 1 0 0 0 1 0 0 0 0 1
NB. so this tells me that 8 times the prediction was wrong and 7 times
right.
NB. so I should run `OUTPUT fit__pipe INPUT` 2 or 3 more times.
What does the other output mean? For example what is alternating 1 and 2,
what is 1...20, what is 10?
Thanks,
--
(B=)
--
For information about J forums see http://www.jsoftware.com/forums.htm
Re: [Jprogramming] convolutional neural network [was simplifying im2col]
Hi Devon. Did you run the init.ijs script. If you run that initially, everything should be setup, and you should have no problems. On Friday, April 26, 2019, 9:54:55 AM GMT+9, Devon McCormick wrote: Hi - so I tried running the code at " https://github.com/jonghough/jlearn/blob/master/adv/conv2d.ijs"; but get numerous value errors. Is there another package somewhere that I need to load? Thanks, Devon On Fri, Apr 19, 2019 at 10:57 AM Raul Miller wrote: > That's the same thing as a dot product on ravels, unless the ranks of > your arguments are ever different. > > Thanks, > > -- > Raul > > On Thu, Apr 18, 2019 at 8:13 PM jonghough via Programming > wrote: > > > > The convolution kernel function is just a straight up elementwise > multiply and then sum all, it is not a dot product or matrix product. > > Nice illustration is found here: https://mlnotebook.github.io/post/CNN1/ > > > > so +/@:,@:* works. I don't know if there is a faster way to do it. > > > > Thanks, > > Jon On Friday, April 19, 2019, 5:54:24 AM GMT+9, Raul Miller < > [email protected]> wrote: > > > > They're also not equivalent. > > > > For example: > > > > (i.2 3 4) +/@:,@:* i.2 3 > > 970 > > (i.2 3 4) +/ .* i.2 3 > > |length error > > > > I haven't studied the possibilities of this code base enough to know > > how relevant this might be, but if you're working with rank 4 arrays, > > this kind of thing might matter. > > > > On the other hand, if the arrays handled by +/@:,@:* are the same > > shape, then +/ .*&, might be what you want. (Then again... any change > > introduced on "performance" grounds should get at least enough testing > > to show that there's a current machine where that change provides > > significant benefit for plausible data.) > > > > Thanks, > > > > -- > > Raul > > > > On Thu, Apr 18, 2019 at 4:16 PM Henry Rich wrote: > > > > > > FYI: +/@:*"1 and +/ . * are two ways of doing dot-products fast. > > > +/@:,@:* is not as fast. > > > > > > Henry Rich > > > > > > On 4/18/2019 10:38 AM, jonghough via Programming wrote: > > > > > > > > Regarding the test network I sent in the previous email, it will not > work. This one should: > > > > > > > > NB. > = > > > > > > > > > > > > NB. 3 classes > > > > NB. horizontal lines (A), vertical lines (B), diagonal lines (C). > > > > NB. each class is a 3 channel matrix 3x8x8 > > > > > > > > > > > > A1=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0, 1 1 > 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 > 0 0, 0 0 0 0 0 0 0 0 > > > > > > > > A2=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 > 1 1 1 1 1 1, 0 0 0 0 0 0 0 0 > > > > > > > > A3=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0 > > > > > > > > A4=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0, 1 1 > 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0 > > > > > > > > A5=: 2 |. 
A4 > > > > > > > > > > > > > > > > B1=: |:"2 A1 > > > > > > > > B2=: |:"2 A2 > > > > > > > > B3=: |:"2 A3 > > > > > > > > B4=: |:"2 A4 > > > > > > > > B5=: |:"2 A5 > > > > > > > > > > > > > > > > C1=: 3 8 8 $ 1 0 0 0 0 0 0 1, 0 1 0 0 0 0 1 0, 0 0 1 0 0 1 0 0, 0 0 > 0 1 1 0 0 0, 0 0 0 1 1 0 0 0, 0 0 1 0 0 1 0 0, 0 1 0 0 0 0 1 0, 1 0 0 0 0 0 > 0 1 > > > > > > > > C2=: 3 8 8 $ 1 0 0 0 0 0 0 0, 0 1 0 0 0 0 0 0, 0 0 1 0 0 0 0 0, 0 0 > 0 1 0 0 0 0, 0 0 0 0 1 0 0 0, 0 0 0 0 0 1 0 0, 0 0 0 0 0 0 1 0, 0 0 0 0 0 0 > 0 1 > > > > > > > > C3=: 3 8 8 $ 1 0 1 0 1 0 0 0, 0 1 0 1 0 1 0 0, 0 0 1 0 1 0 1 0, 0 0 > 0 1 0 1 0 1, 1 0 0 0 1 0 1 0, 0 1 0 0 0 1 0 1, 1 0 1 0 0 0 1 0, 0 1 0 1 0 0 > 0 1 > > > > > > > > C4=: |."1 C3 > > > > > > > > C5=: 3 8 8 $ 1 1 1 1 0 0 0 0, 0 0 1 1 1 1 0 0, 0 0 0 0 1 1 1 1, 1 1 > 0 0 0 0 1 1, 1 1 1 1 0 0 0 0, 0 0 1 1 1 1 0 0, 0 0 0 0 1 1 1 1, 1 1 0 0 0 0 > 1 1 > > > > > > > > > > > > > > > > A=: 5 3 8 8 $, A1, A2, A3, A4, A5 > > > > > > > > B=: 5 3 8 8 $
Re: [Jprogramming] convolutional neural network [was simplifying im2col]
The erorr is in the L-BFGS test, which is unrelated to conv nets. In fact just remove all tests from the init.ijs file and that error can be ignored. These lines at teh bottom of init.ijs NB. test require jpath '~Projects/jlearn/test/testbase.ijs' require jpath '~Projects/jlearn/test/testgmm.ijs' require jpath '~Projects/jlearn/test/testknn.ijs' require jpath '~Projects/jlearn/test/testnnopt.ijs' require jpath '~Projects/jlearn/test/testoptimize.ijs' require jpath '~Projects/jlearn/test/testrunner.ijs' can be deleted. On Friday, April 26, 2019, 10:39:32 PM GMT+9, Devon McCormick wrote: Hi Jon - I came up with a work-around by changing the "beta" parameter to 1e_7 instead of 0.0001 (in the call to "minBFGS" in "test4") but have no idea what this means in the larger scheme of things. Your helpful error message gave me the clue I needed: Error attempting line search. The input values to the function may be too extreme. Function input value (xk): _9.65977e27 _6.73606e6 Search direction value (pk): 2.06573e27 3.10187e6 A possible solution is to reduce the size of the initial inverse hessian scale, beta. Beta is currently set to 0.0001, which may be too large/small. |uncaught throw.: minBFGS_BFGS_ |minBFGS_BFGS_[0] Thanks again. I look forward to exploring this code. Regards, Devon On Fri, Apr 26, 2019 at 9:16 AM Devon McCormick wrote: > Hi Jon - > I got your example CIFAR-10 code running on one of my machines but got the > following error when running "init.ijs" on another one (perhaps with a > different version of J 8.07): > Test success AdaGrad Optimizer test 1 > 1 > Test success 1 > 1 > Test success 2 > 1 > Test success 3 > |value error: minBFGS_BFGS_ > | k=. u y > |assertThrow[2] > > It looks like "minBFGS_BFGS_" was not defined, so I pasted in the > definition before loading "init.ijs" and got a little further only to hit > this error: > Test success 3 > |NaN error: dot > |dot[:0] > 13!:1'' > |NaN error > *dot[:0] > | Hk=.(rhok*(|:sk)dot sk)+(I-rhok*(|:sk)dot yk)dot Hk > dot(I-rhok*(|:yk)dot sk) > |minBFGS_BFGS_[0] > | k=. u y > |assertThrow[2] > | ( minBFGS_BFGS_ assertThrow(f f.`'');(fp f.`'');(4 > 3);10;0.0001;0.0001) > |test4[2] > | res=. u'' > |testWrapper[0] > | test4 testWrapper 4 > |run__to[5] > | run__to'' > |[-180] > c:\users\devon_mccormick\j64-807-user\projects\jlearn\test\testoptimize.ijs > | 0!:0 y[4!:55<'y' > |script[0] > |fn[0] > | fn fl > |load[:7] > | 0 load y > |load[0] > | load fls > |require[1] > | require jpath'~Projects/jlearn/test/testoptimize.ijs' > |[-39] c:\Users\devon_mccormick\j64-807-user\projects\jlearn\init.ijs > | 0!:0 y[4!:55<'y' > |script[0] > |fn[0] > | fn fl > |load[:7] > | 0 load y > |load[0] > | > load'c:\Users\devon_mccormick\j64-807-user\projects\jlearn\init.ijs' > > The arguments to "dot" appear to be extreme values: > (I-rhok*(|:yk)dot sk) > 1 0 > _ 0 > Hk > _ _4.62371e38 > _4.62371e38 2.76904e_78 > > Any idea what might cause this? > > Thanks, > > Devon > > > > > On Thu, Apr 25, 2019 at 9:55 PM Devon McCormick > wrote: > >> That looks like it did the trick - thanks! >> >> On Thu, Apr 25, 2019 at 9:23 PM jonghough via Programming < >> [email protected]> wrote: >> >>> Hi Devon. >>> Did you run the init.ijs script. If you run that initially, everything >>> should be setup, and you should have no problems. >>> On Friday, April 26, 2019, 9:54:55 AM GMT+9, Devon McCormick < >>> [email protected]> wrote: >>> >>> Hi - so I tried running the code at " >>> https://github.com/jonghough/jlearn/blob/master/adv/conv2d.ijs"; but get >>> numerous value errors. 
>>> >>> Is there another package somewhere that I need to load? >>> >>> Thanks, >>> >>> Devon >>> >>> On Fri, Apr 19, 2019 at 10:57 AM Raul Miller >>> wrote: >>> >>> > That's the same thing as a dot product on ravels, unless the ranks of >>> > your arguments are ever different. >>> > >>> > Thanks, >>> > >>> > -- >>> > Raul >>> > >>> > On Thu, Apr 18, 2019 at 8:13 PM jonghough via Programming >>> > wrote: >>> > > >>> > > The convolution kernel function is
Re: [Jprogramming] convolutional neural network [was simplifying im2col]
I think you may be right. Thanks for pointing this out. However, since my
networks mostly work, I am going to assume that having too many biases doesn't
negatively impact the results, except for adding "useless" calculations. If you
are correct, I should fix this.
I have edited the source on a new branch to only have a 2d shaped bias. (see:
https://github.com/jonghough/jlearn/blob/feature/conv2d_layer_fix/adv/conv2d.ijs)
This is not on the master branch, but on a new branch. I am not 100% convinced
this is correct, and so am going to think about it.
I did, however, test it on the MNIST dataset and got about 90% accuracy on test
data, after 2 epochs (takes a couple of hours to run on a PC). MNIST data is
not particularly challenging though. I would test it on CIFAR-10 if I had some
more time, but don't at the moment.
The MNIST conv net is:
NB. =
PATHTOTRAIN=: '/path/on/my/pc/to/mnist/train/input'
PATHTOTEST=: '/path/on/my/pc/to/mnist/test/input'
PATHTOTRAINLABELS=:'/path/on/my/pc/to/mnist/train/labels'
PATHTOTESTLABELS=: '/path/on/my/pc/to/mnist/test/labels'
rf=: 1!:1
data=: a.i. toJ dltb , rf < PATHTOTRAIN
TRAININPUT =: 255 %~ [ 6 1 28 28 $, 16}. data
data=: a.i. toJ dltb , rf < PATHTOTEST
TESTINPUT =: 255 %~ [ 1 1 28 28 $, 16}. data
data=: a.i. toJ dltb , rf < PATHTOTRAINLABELS
TRAINLABELS =: 6 10 $ , #: 2^ 8}. data
data=: a.i. toJ dltb , rf < PATHTOTESTLABELS
TESTLABELS =: 1 10 $ , #: 2^ 8}. data
pipe=: (100;20;'softmax';1; 'l2';0.0001) conew 'NNPipeline'
c1=: ((50 1 4 4);3;'relu';'adam';0.01;0) conew 'Conv2D'
b1 =: (0; 1 ;1e_4;50;0.001) conew 'BatchNorm2D'
a1 =: 'relu' conew 'Activation'
c2=: ((64 50 5 5);4;'relu';'adam';0.01;0) conew 'Conv2D'
b2 =: (0; 1 ;1e_4;64;0.001) conew 'BatchNorm2D'
a2 =: 'relu' conew 'Activation'
p1=: 2 conew 'PoolLayer'
fl=: 1 conew 'FlattenLayer'
fc1=: (64;34;'tanh';'adam';0.01) conew 'SimpleLayer'
b3 =: (0; 1 ;1e_4;34;0.001) conew 'BatchNorm'
a3 =: 'tanh' conew 'Activation'
fc2=: (34;10;'softmax';'adam';0.01) conew 'SimpleLayer'
b4 =: (0; 1 ;1e_4;10;0.001) conew 'BatchNorm'
a4 =: 'softmax' conew 'Activation'
addLayer__pipe c1
addLayer__pipe b1
addLayer__pipe a1
addLayer__pipe c2
addLayer__pipe b2
addLayer__pipe a2
addLayer__pipe p1
addLayer__pipe fl
addLayer__pipe fc1
addLayer__pipe b3
addLayer__pipe a3
addLayer__pipe fc2
addLayer__pipe b4
addLayer__pipe a4
TRAINLABELS fit__pipe TRAININPUT
NB. f=: 3 : '+/ ((y + i. 100){TESTLABELS) -:"1 1 (=>./)"1 >{:predict__pipe (y+i.100){TESTINPUT'
NB. run f"0[100*i.100 to run prediction on ALL test set (in batches of size 100). Average the results to get accuracy.
NB. =
As I said, I am going to go back and look at my notes (don't have them at
hand). I am sure you are correct, but then, am not 100% convinced that my new
bias shape is correct. After thinking it through I will probably merge the fix.
About backprop for bias, I simply took the ntd (next layer training deltas) and
averaged them across the first dimension, and then
multiplied by the learn rate, and subtracted from the current bias. This was a
fudge from me. Why average? To make the shapes fit. Biases are shared between
neurons so it makes sense to average the deltas that the bias contributes to.
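In J terms the update described there amounts to something like the following sketch (ntd, bias, and alpha stand in for the corresponding class variables; this is not the exact code in conv2d.ijs):
   NB. average the incoming deltas over the leading (batch) axis,
   NB. scale by the learning rate, and subtract from the current bias
   bias =: bias - alpha * (+/ % #) ntd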
As I am sure you have noticed, the actual implementation of convnet backprop is
the trickiest part, and also the least written about. I have a copy of
Goodfellow and Bengio's Deep Learning book, which is mostly excellent, but even
that just skims over backprop for convnets, or gives it a very abstract
mathematical treatment, but the actual nitty gritty details are left to the
reader. So my own interpretation of the actual correct implementation may be
wrong in places (but then again, how wrong can it be, if it gets correct
answers?). On Sunday, April 28, 2019, 3:34:57 PM GMT+9, Brian Schott
wrote:
Jon,
I have been studying your simple_conv_test.ijs example and trying to
compare it to the *dynamic* example at
http://cs231n.github.io/convolutional-networks/#conv where only 2 biases
are used with their stride of 2 and 2 output kernels of shape 3x3. (I
believe they have 2 biases because of the 2 output kernels.) In contrast,
according to my manual reconstruction of your verb preRun in conv2d.ijs I
get a whopping 90 biases (a 10x3x3 array), one for each of the 10 output
kernels in each of its 3x3 positions on the 8x8 image.
My confusion is that based on the cs231n example, I would have guessed
that you would have had only 10 biases, not 90. Can you comment on that,
please?
[Of course in my example below, my `filter` values are ridiculous.
And I have not adjusted for epochs and batches.
But I hope the shape of `filter` and the stride of 2 and the ks are
consistent with your simple example.]
filter =: i. 10 3 4 4
ks =: 2 3$2 2 2 3 4 4
$A,B,C
15 3 8 8
cf=: [: |:"2 [: |: [: +/ ks filter&(convFunc"3 3);._3 ]
$n=: cf"3 A,B,C
15
Re: [Jprogramming] convolutional neural network [was simplifying im2col]
Yes wd was used as a way to write information to the jQT window while the training is in operation. Without it, I would have no way of knowing the status of the model being trained. I only considered jQT usage and not JHS or console, so this will not work for those. If you are using JHS/jConsole, I suggest commenting it out, or just redefining it wd=: ] Thanks, Jon On Monday, April 29, 2019, 3:10:14 AM GMT+9, Devon McCormick wrote: Hi - trying to run the "fit__pipe" function, I encountered a value error on this line: wd^:1 'msgs' so I commented it out under the assumption this is "wd" defined in JQt and this is some sort of progress message. Is this correct? Thanks, Devon On Sun, Apr 28, 2019 at 9:20 AM jonghough via Programming < [email protected]> wrote: > I think you may be right. Thanks for pointing this out. However, since my > networks mostly work, I am going to assume that having too many biases > doesn't negatively impact the results, except for adding "useless" > calculations. If you are correct, I should fix this. > > I have edited the source on a new branch to only have a 2d shaped bias. > (see: > https://github.com/jonghough/jlearn/blob/feature/conv2d_layer_fix/adv/conv2d.ijs > ) > This is not on the master branch, but on a new branch. I am not 100% > convinced this is correct, and so am going to think about it. > > I did, however test it on the MNIST dataset and got about 90% accuracy on > test data, after 2 epochs (takes a couple of hours to run on a PC). MNIST > data is not particularly challenging though. Would test it on CIFAR-10 if I > had somem ore time, but don't at the moment. > > The MNIST conv net is: > > NB. = > > > PATHTOTRAIN=: '/path/on/my/pc/to/mnist/train/input' > PATHTOTEST=: '/path/on/my/pc/to/mnist/test/input' > PATHTOTRAINLABELS=:'/path/on/my/pc/to/mnist/train/labels' > PATHTOTESTLABELS=: '/path/on/my/pc/to/mnist/test/labels' > rf=: 1!:1 > data=: a.i. toJ dltb , rf < PATHTOTRAIN > TRAININPUT =: 255 %~ [ 6 1 28 28 $, 16}. data > > data=: a.i. toJ dltb , rf < PATHTOTEST > TESTINPUT =: 255 %~ [ 1 1 28 28 $, 16}. data > > > data=: a.i. toJ dltb , rf < PATHTOTRAINLABELS > TRAINLABELS =: 6 10 $ , #: 2^ 8}. data > > data=: a.i. toJ dltb , rf < PATHTOTESTLABELS > TESTLABELS =: 1 10 $ , #: 2^ 8}. data > > pipe=: (100;20;'softmax';1; 'l2';0.0001) conew 'NNPipeline' > c1=: ((50 1 4 4);3;'relu';'adam';0.01;0) conew 'Conv2D' > b1 =: (0; 1 ;1e_4;50;0.001) conew 'BatchNorm2D' > a1 =: 'relu' conew 'Activation' > c2=: ((64 50 5 5);4;'relu';'adam';0.01;0) conew 'Conv2D' > b2 =: (0; 1 ;1e_4;64;0.001) conew 'BatchNorm2D' > a2 =: 'relu' conew 'Activation' > p1=: 2 conew 'PoolLayer' > fl=: 1 conew 'FlattenLayer' > fc1=: (64;34;'tanh';'adam';0.01) conew 'SimpleLayer' > b3 =: (0; 1 ;1e_4;34;0.001) conew 'BatchNorm' > a3 =: 'tanh' conew 'Activation' > fc2=: (34;10;'softmax';'adam';0.01) conew 'SimpleLayer' > b4 =: (0; 1 ;1e_4;10;0.001) conew 'BatchNorm' > a4 =: 'softmax' conew 'Activation' > > addLayer__pipe c1 > addLayer__pipe b1 > addLayer__pipe a1 > addLayer__pipe c2 > addLayer__pipe b2 > addLayer__pipe a2 > addLayer__pipe p1 > addLayer__pipe fl > addLayer__pipe fc1 > addLayer__pipe b3 > addLayer__pipe a3 > addLayer__pipe fc2 > addLayer__pipe b4 > addLayer__pipe a4 > > > > TRAINLABELS fit__pipe TRAININPUT > > NB. f=: 3 : '+/ ((y + i. 100){TESTLABELS) -:"1 1 (=>./)"1 >{:predict__pipe > (y+i.100){TESTINPUT' > NB. run f"0[100*i.100 to run prediction on ALL test set (in batches of > size 100. Avg the result to get accuracy. > NB. 
= > > As I said, I am going to go back and look at my notes (don't have them at > hand). I am sure you are correct, but then, am not 100% convinced that my > new bias shape is correct. After thinking it through I will probably merge > the fix. > > About backprop for bias, I simply took the ntd (next layer training > deltas) and averaged them across the first dimension, and then > multiplied by learn rate, and subtracted from the current bias. This was, > a fudge from me. Why average? to make the shapes fit. Biases are shared > between neurons so it makes sense to average the deltas that the bias > contributes to. As I am sure you
Re: [Jprogramming] convolutional neural network [was simplifying im2col]
The locales may be a bit confusing, and if they are slowing down the training,
then I will definitely rethink them. The main idea is that
every layer is its own object and conducts its own forward and backward passes
during training and prediction.
Every layer, including Conv2D, LSTM, SimpleLayer, Activation, PoolLayer etc, is
a child of 'NNLayer', which itself is a child of 'NN' , which contains some
verbs useful to all layers (NN and NNLayer are defined in advnn.ijs).
This OO style is based on how I would do it in python, which I am far more
comfortable with than J. The style is probably not very J-esque (if there is
such a thing), but the fact that neural nets have so many moving parts (with
lots of mutable state), means some sort of object oriented programming is
suitable.
My understanding of J's OO system mainly comes from here:
https://www.jsoftware.com/help/learning/25.htm (there is not much literature on
this subject on the Jsoftware website, AFAIK). So just using coinsert and
coclass I created object hierarchies. Any advice for improvements of the design
is welcome.
As for global variables, I am trying to think, but can't recall any particular
global variables being passed around.
Basically the 'NNPipeline' object contains an array of 'NNLayer's (i.e.
layers__pipe), which it just iterates over using forward and backward verbs,
passing the output of one layer to the input of the next. backward iterates in
reverse, passing the gradient values from the n+1th layer to the nth layer.
Each layer updates its internal parameters (i.e. weights) with each backward
pass.
So,
L1=:0{layers__pipe
w__L1
Will give you the weights of the first layer, assuming it is a 'SimpleLayer'
(i.e. Fully Connected Layer).
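A minimal sketch of the forward loop described above (the verb name is hypothetical; it assumes a list layers of boxed layer locales, each defining a monadic forward, as in the thread):
forwardAll =: 3 : 0
NB. feed the batch y through each layer object in turn,
NB. handing each layer's output to the next layer as input
data =. y
for_L. layers do.
  data =. forward__L data
end.
data
)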
Thanks,
Jon On Tuesday, April 30, 2019, 3:02:44 AM GMT+9, Devon McCormick
wrote:
Hi -
yes, I also have difficulty tracing through the very OOP-style of coding
though I have only attempted it manually so far. By my count, the example
model for the CIFAR10 problem creates 23 namespaces, each with three to
more than twenty values, most of them scalars. Also, this style of coding
appears to use globals to pass values, making it more difficult to figure
out the data flow.
My de-looping change decreased the execution time by such a large amount,
running hundreds of times faster, that I suspect it has to be wrong. Upon
reflection, I suspect that the use of namespaces and global scalars for
passing values essentially serializes the computation, so my attempt to
call the "fit" function with arrays fails to run all but one of the trials.
I intend to continue to work on this both because of my interest in
developing my own CNN for working with photos and for working out an
example of the difference between OOP and array programming.
Also, J's flaky handling of namespaces in debug does not help efforts to
understand this code at a high level.
Regards,
Devon
On Mon, Apr 29, 2019 at 1:16 PM Brian Schott wrote:
> Devon,
>
> I'm not sure what I would have called my difficulty with understanding
> Jon's fine jlearn system, but "de-looping" may be a good choice. To me the
> difficulty involves the heavy use of locales, with which I have little
> experience. This has meant that trying to trace verb calls and using debug
> have been difficult for me. For example when I try to use debug for a verb
> like foo__bar, the stop manager in jQt produces a paragraph of lines in the
> selected verb but not a list of lines. The paragraph does not permit stop
> line selection. I have not tried to use dbss to select stop lines, but
> maybe that is possible.
>
> I wonder if the complication locales inserts has been a reason that I have
> not been able to find a usable verb/adverb/conjunction cross-reference
> script for J.
> The jlearn system is really impressive, though. Its documentation is
> excellent inside most verbs.
>
>
>
> --
> (B=)
> --
> For information about J forums see http://www.jsoftware.com/forums.htm
>
--
Devon McCormick, CFA
Quantitative Consultant
--
For information about J forums see http://www.jsoftware.com/forums.htm
Re: [Jprogramming] convolutional neural network [was simplifying im2col]
This looks very interesting. Sorry, I am traveling until next week so cannot
give it much more than a quick look through at the moment. Next week I will try
to run it.
By the way, following your advice and the issues you discovered with my convnet
(bias shape in particular), I am refactoring my source code. I am struggling to
get much more than 65% accuracy on cifar-10... very irritating.
It looks like your backprop padding is much nicer than mine. Once I have a
chance to look it over properly I will try to integrate that into my source
code.
Thanks,
Jon
On Tuesday, May 21, 2019, 12:11:51 PM GMT+9, Brian Schott
wrote:
I have been developing a toy convolutional neural network patterned after
the toy non-convolutional nn discussed in this thread. The status of the
toy is produced below here for any comments.
NB. simpler_conv_test.ijs
NB. 5/19/19
NB. based on Jon Hough's simple_conv_test.ijs
NB. especially Jon's conv and convFunc in backprop
NB. and patterned after toy nn case study at
NB. http://cs231n.github.io/neural-networks-case-study/#net
NB. This example demonstrates a 3-layer neural net:
NB. an input layer receives 8x8 Black and White (binary)images,
NB. a hidden convolutional layer with 2 4x4 filters
NB. which stride 2 over the 8x8 input images with no zero-padding,
NB. and 3 classes in the softmax activated output layer.
NB. These exact features of the filter sizes and the image
NB. size and the stride and the lack of zero-padding,
NB. produce 9 readings by each filter.
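NB. (A quick check of that count:  >: 2 %~ 8 - 4  gives 3 positions per axis,
NB.  and  *: >: 2 %~ 8 - 4  gives the 9 readings per filter.)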
NB. The names W and W2 for filters/weights are kept from the
NB. original toy case study which had only fully connected
NB. layers, and no convolutional layer.
NB. ==
NB. revised data to use only a single image channel instead of 3 channels
A1=:8
8$"."0''
A2=:8 8$"."0''
A3=:8 8$"."0''
A4=:8 8$"."0''
A5=: 2 |. A4
B1=: |:"2 A1
B2=: |:"2 A2
B3=: |:"2 A3
B4=: |:"2 A4
B5=: |:"2 A5
C1=:8
8$"."0'100101100010010110011010010001101001'
C2=:8
8$"."0'11111111'
C3=:8
8$"."0'1010110101101011010110001010010001011010001001010001'
C4=: |."1 C3
C5=:8
8$"."0'0000111100001111'
A=: 5 8 8 $, A1, A2, A3, A4, A5
B=: 5 8 8 $, B1, B2, B3, B4, B5
C=: 5 8 8 $, C1, C2, C3, C4, C5
X =: INPUT=: A,B,C
Y =: 5#0 1 2 NB. specific target values for this case
NB. some utility verbs
dot =: +/ . *
probs =: (%+/)@:^"1 NB. used in softmax activation
amnd =: (1-~{)`[`]}
mean =: +/ % #
rand01 =: ?.@$ 0: NB. ? replaced with ?. for demo purposes only
normalrand =: (2 o. [: +: [: o. rand01) * [: %: [: - [: +: [: ^. rand01
NB. Hough's backprop magic verb
deconv =: 4 : '(1 1,:5 5) x&(conv"2 _) tesl y'
conv =: +/@:,@:* NB. Jon Hough's tessellation verb for conv
tesl =: ;._3
NB. backprop requires some zero-padding, thus msk
msk =: (1j1 1j1 1)&# NB. interleave zeros between items: zero-padding/dilation of a 3x3 array (3 -> 5 per axis)
NB. tessellation requires and produces rectangular data
NB. so collapse and sqr reworks and creates such data
collapse =: (,"2)&(1 2&|:) NB. list from internal array
sqr =: $&,~}:@$,2&$@:%:@{:@$ NB. square array from list
NB. some training parameters
step_size =: 1e_0
reg =: 1e_3
N =: 5 NB. number of points per class
D =: 2 NB. dimensionality
K =: 3 NB. number of classes
bias =: 0
train =: dyad define
'W b W2 b2' =. x NB. weights and biases
num_examples =. #X
for_i. 1+i. y do. NB. y is number of batches
hidden_layer =. 0>.b +"1(2 2,:4 4) W&(conv"2 _) tesl"3 2 X
c_hidden_layer =. collapse hidden_layer
scores =. mean 0>.b2 +"1 (1 0 2|:c_hidden_layer) dot"2 W2
prob =. probs scores
correct_logprobs =. -^.Y}|:prob
data_loss =. (+/correct_logprobs)%num_examples
reg_loss =. (0.5*reg*+/,W*W) + 0.5*reg*+/,W2*W2
loss =. data_loss + reg_loss
if. 0=(y%10)|i do.
smoutput i,loss
end.
dscores =. prob
dscores =. Y amnd"0 1 dscores
dscores =. dscores%num_examples
dW2 =. 1 0 2|:(|:collapse hidden_layer)dot dscores
db2 =. mean dscores
NB. Mike Day fixed next line 5/16/19
dhidden =. (0<|:"2 c_hidden_layer)*dscores dot|:W2
padded =. msk"1 msk"2 sqr |:"2 dhidden
dW =. mean 0 3 1 2|: padded deconv"3 2 X
db =. mean mean dhidden
dW2 =. dW2 + reg*W2
dW =. dW + reg*W
W =. W-step_size * dW
b =. b-step_size * db
W2 =. W2-step_size * dW2
b2 =. b2-step_size * db2
end.
W;b;W2;b2
)
Note 'example start of training'
NB. 2 4 4 is the shape of W.
NB. That shape reflects 2 filters of wxh = 4x4
NB. over the 8x8 image layer
NB. The hidden layer is convolutional; W defines the
NB. weights (filters) between input layer and hidden layer.
$xW =. 0.026*normalrand 2 4 4 NB. 0.026 is wild-ass guess
2 4
