Hi Jon -
I got your example CIFAR-10 code running on one of my machines but got the
following error when running "init.ijs" on another one (perhaps with a
different version of J 8.07):
Test success AdaGrad Optimizer test 1
1
Test success 1
1
Test success 2
1
Test success 3
|value error: minBFGS_BFGS_
| k=. u y
|assertThrow[2]
It looks like "minBFGS_BFGS_" was not defined, so I pasted in the
definition before loading "init.ijs" and got a little further only to hit
this error:
Test success 3
|NaN error: dot
|dot[:0]
13!:1''
|NaN error
*dot[:0]
| Hk=.(rhok*(|:sk)dot sk)+(I-rhok*(|:sk)dot yk)dot Hk dot(I-rhok*(|:yk)dot sk)
|minBFGS_BFGS_[0]
| k=. u y
|assertThrow[2]
| ( minBFGS_BFGS_ assertThrow(f f.`'');(fp f.`'');(4 3);100000;0.0001;0.0001)
|test4[2]
| res=. u''
|testWrapper[0]
| test4 testWrapper 4
|run__to[5]
| run__to''
|[-180] c:\users\devon_mccormick\j64-807-user\projects\jlearn\test\testoptimize.ijs
| 0!:0 y[4!:55<'y'
|script[0]
|fn[0]
| fn fl
|load[:7]
| 0 load y
|load[0]
| load fls
|require[1]
| require jpath'~Projects/jlearn/test/testoptimize.ijs'
|[-39] c:\Users\devon_mccormick\j64-807-user\projects\jlearn\init.ijs
| 0!:0 y[4!:55<'y'
|script[0]
|fn[0]
| fn fl
|load[:7]
| 0 load y
|load[0]
| load'c:\Users\devon_mccormick\j64-807-user\projects\jlearn\init.ijs'
The arguments to "dot" appear to be extreme values:
(I-rhok*(|:yk)dot sk)
1 0
_ 0
Hk
_ _4.62371e38
_4.62371e38 2.76904e_78
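For what it's worth, those two values alone seem to be enough to reproduce
the failure (a minimal sketch, assuming "dot" is +/ . * as elsewhere in this
thread):
   dot =: +/ . *
   Hk =: _ _4.62371e38 ,: _4.62371e38 2.76904e_78
   Hk dot (1 0 ,: _ 0)   NB. first entry is (_*1) + (_4.62371e38*_), i.e. _ + __
|NaN error: dot
so I am guessing rhok became infinite somewhere upstream, perhaps because
yk dot sk was zero.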
Any idea what might cause this?
Thanks,
Devon
On Thu, Apr 25, 2019 at 9:55 PM Devon McCormick <[email protected]> wrote:
> That looks like it did the trick - thanks!
>
> On Thu, Apr 25, 2019 at 9:23 PM jonghough via Programming <
> [email protected]> wrote:
>
>> Hi Devon.
>> Did you run the init.ijs script? If you run that first, everything
>> should be set up, and you should have no problems.
>> On Friday, April 26, 2019, 9:54:55 AM GMT+9, Devon McCormick <
>> [email protected]> wrote:
>>
>> Hi - so I tried running the code at "
>> https://github.com/jonghough/jlearn/blob/master/adv/conv2d.ijs" but get
>> numerous value errors.
>>
>> Is there another package somewhere that I need to load?
>>
>> Thanks,
>>
>> Devon
>>
>> On Fri, Apr 19, 2019 at 10:57 AM Raul Miller <[email protected]>
>> wrote:
>>
>> > That's the same thing as a dot product on ravels, unless the ranks of
>> > your arguments are ever different.
>> >
>> > Thanks,
>> >
>> > --
>> > Raul
>> >
>> > On Thu, Apr 18, 2019 at 8:13 PM jonghough via Programming
>> > <[email protected]> wrote:
>> > >
>> > > The convolution kernel function is just a straight-up elementwise
>> > > multiply followed by a sum over all elements; it is not a dot product
>> > > or matrix product.
>> > > A nice illustration is found here: https://mlnotebook.github.io/post/CNN1/
>> > >
>> > > so +/@:,@:* works. I don't know if there is a faster way to do it.
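>> > >
>> > > A tiny worked example of the kernel operation (made-up 2x2 values):
>> > >    k =. 2 2 $ 1 0 _1 2
>> > >    w =. 2 2 $ 3 4 5 6
>> > >    k +/@:,@:* w     NB. (1*3) + (0*4) + (_1*5) + (2*6)
>> > > 10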
>> > >
>> > > Thanks,
>> > > Jon
>> > >
>> > > On Friday, April 19, 2019, 5:54:24 AM GMT+9, Raul Miller <[email protected]> wrote:
>> > >
>> > > They're also not equivalent.
>> > >
>> > > For example:
>> > >
>> > > (i.2 3 4) +/@:,@:* i.2 3
>> > > 970
>> > > (i.2 3 4) +/ .* i.2 3
>> > > |length error
>> > >
>> > > I haven't studied the possibilities of this code base enough to know
>> > > how relevant this might be, but if you're working with rank 4 arrays,
>> > > this kind of thing might matter.
>> > >
>> > > On the other hand, if the arrays handled by +/@:,@:* are the same
>> > > shape, then +/ .*&, might be what you want. (Then again... any change
>> > > introduced on "performance" grounds should get at least enough testing
>> > > to show that there's a current machine where that change provides
>> > > significant benefit for plausible data.)
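>> > >
>> > > For instance, a quick check of the same-shape case (illustrative
>> > > integer arrays):
>> > >    a =. i. 2 3 4
>> > >    b =. 1 + i. 2 3 4
>> > >    (a +/@:,@:* b) -: (a +/ .*&, b)
>> > > 1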
>> > >
>> > > Thanks,
>> > >
>> > > --
>> > > Raul
>> > >
>> > > On Thu, Apr 18, 2019 at 4:16 PM Henry Rich <[email protected]>
>> wrote:
>> > > >
>> > > > FYI: +/@:*"1 and +/ . * are two ways of doing dot-products fast.
>> > > > +/@:,@:* is not as fast.
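>> > > >
>> > > > For example, on long integer vectors (a rough sketch; timings will
>> > > > vary by machine and data shape):
>> > > >    a =. ? 1e6 $ 100
>> > > >    b =. ? 1e6 $ 100
>> > > >    (a +/@:*"1 b) , (a +/ .* b) , (a +/@:,@:* b)  NB. same value three ways
>> > > >    6!:2 'a +/ .* b'       NB. time each form with the Time foreign
>> > > >    6!:2 'a +/@:,@:* b'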
>> > > >
>> > > > Henry Rich
>> > > >
>> > > > On 4/18/2019 10:38 AM, jonghough via Programming wrote:
>> > > > >
>> > > > > Regarding the test network I sent in the previous email, it will
>> > > > > not work. This one should:
>> > > > >
>> > > > > NB. =========================================================================
>> > > > >
>> > > > >
>> > > > > NB. 3 classes
>> > > > > NB. horizontal lines (A), vertical lines (B), diagonal lines (C).
>> > > > > NB. each class is a 3 channel matrix 3x8x8
>> > > > >
>> > > > >
>> > > > > A1=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0
>> > > > >
>> > > > > A2=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0
>> > > > >
>> > > > > A3=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0
>> > > > >
>> > > > > A4=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0
>> > > > >
>> > > > > A5=: 2 |. A4
>> > > > >
>> > > > >
>> > > > >
>> > > > > B1=: |:"2 A1
>> > > > >
>> > > > > B2=: |:"2 A2
>> > > > >
>> > > > > B3=: |:"2 A3
>> > > > >
>> > > > > B4=: |:"2 A4
>> > > > >
>> > > > > B5=: |:"2 A5
>> > > > >
>> > > > >
>> > > > >
>> > > > > C1=: 3 8 8 $ 1 0 0 0 0 0 0 1, 0 1 0 0 0 0 1 0, 0 0 1 0 0 1 0 0, 0 0 0 1 1 0 0 0, 0 0 0 1 1 0 0 0, 0 0 1 0 0 1 0 0, 0 1 0 0 0 0 1 0, 1 0 0 0 0 0 0 1
>> > > > >
>> > > > > C2=: 3 8 8 $ 1 0 0 0 0 0 0 0, 0 1 0 0 0 0 0 0, 0 0 1 0 0 0 0 0, 0 0 0 1 0 0 0 0, 0 0 0 0 1 0 0 0, 0 0 0 0 0 1 0 0, 0 0 0 0 0 0 1 0, 0 0 0 0 0 0 0 1
>> > > > >
>> > > > > C3=: 3 8 8 $ 1 0 1 0 1 0 0 0, 0 1 0 1 0 1 0 0, 0 0 1 0 1 0 1 0, 0 0 0 1 0 1 0 1, 1 0 0 0 1 0 1 0, 0 1 0 0 0 1 0 1, 1 0 1 0 0 0 1 0, 0 1 0 1 0 0 0 1
>> > > > >
>> > > > > C4=: |."1 C3
>> > > > >
>> > > > > C5=: 3 8 8 $ 1 1 1 1 0 0 0 0, 0 0 1 1 1 1 0 0, 0 0 0 0 1 1 1 1, 1 1 0 0 0 0 1 1, 1 1 1 1 0 0 0 0, 0 0 1 1 1 1 0 0, 0 0 0 0 1 1 1 1, 1 1 0 0 0 0 1 1
>> > > > >
>> > > > >
>> > > > >
>> > > > > A=: 5 3 8 8 $, A1, A2, A3, A4, A5
>> > > > >
>> > > > > B=: 5 3 8 8 $, B1, B2, B3, B4, B5
>> > > > >
>> > > > > C=: 5 3 8 8 $, C1, C2, C3, C4, C5
>> > > > >
>> > > > > INPUT=: A,B,C
>> > > > >
>> > > > > OUTPUT=: 15 3 $ 1 0 0, 1 0 0, 1 0 0, 1 0 0, 1 0 0, 0 1 0, 0 1 0, 0 1 0, 0 1 0, 0 1 0, 0 0 1, 0 0 1, 0 0 1, 0 0 1, 0 0 1
>> > > > >
>> > > > >
>> > > > >
>> > > > > pipe=: (10;10;'softmax';1;'l2';0.0001) conew 'NNPipeline'
>> > > > >
>> > > > > c1=: ((10 3 4 4);2;'relu';'adam';0.01;0) conew 'Conv2D'
>> > > > >
>> > > > > b1=: (0; 1 ;0.0001;10;0.01) conew 'BatchNorm2D'
>> > > > >
>> > > > > a1=: 'relu' conew 'Activation'
>> > > > >
>> > > > >
>> > > > >
>> > > > > c2=: ((12 10 2 2); 1;'relu';'adam';0.01;0) conew 'Conv2D'
>> > > > >
>> > > > > b2=: (0; 1 ;0.0001;5;0.01) conew 'BatchNorm2D'
>> > > > >
>> > > > > a2=: 'relu' conew 'Activation'
>> > > > >
>> > > > > p1=: 2 conew 'PoolLayer'
>> > > > >
>> > > > >
>> > > > >
>> > > > > fl=: 3 conew 'FlattenLayer'
>> > > > >
>> > > > > fc=: (12;3;'softmax';'adam';0.01) conew 'SimpleLayer'
>> > > > >
>> > > > > b3=: (0; 1 ;0.0001;2;0.01) conew 'BatchNorm'
>> > > > >
>> > > > > a3=: 'softmax' conew 'Activation'
>> > > > >
>> > > > >
>> > > > >
>> > > > > addLayer__pipe c1
>> > > > >
>> > > > > addLayer__pipe p1
>> > > > >
>> > > > > NB.addLayer__pipe b1
>> > > > >
>> > > > > addLayer__pipe a1
>> > > > >
>> > > > > addLayer__pipe c2
>> > > > >
>> > > > > NB.addLayer__pipe b2
>> > > > >
>> > > > > addLayer__pipe a2
>> > > > >
>> > > > > addLayer__pipe fl
>> > > > >
>> > > > > addLayer__pipe fc
>> > > > >
>> > > > > NB.addLayer__pipe b3
>> > > > >
>> > > > > addLayer__pipe a3
>> > > > >
>> > > > > require 'plot viewmat'
>> > > > > NB. check the input images (per channel)
>> > > > > NB. viewmat"2 A1
>> > > > > NB. viewmat"2 B1
>> > > > > NB. viewmat"2 C1
>> > > > >
>> > > > >
>> > > > > OUTPUT fit__pipe INPUT NB. <--- should get 100%ish accuracy after only a few iterations.
>> > > > > NB. =========================================================================
>> > > > >
>> > > > >
>> > > > > Running the above doesn't prove much, as there is no training /
>> > > > > testing set split. It is just to see *if* the training will push the
>> > > > > network's parameters in the correct direction. Getting accurate
>> > > > > predictions on all the A, B, C images will at least show that the
>> > > > > network is not doing anything completely wrong. It is also useful as
>> > > > > a playground to see if different ideas work.
>> > > > >
>> > > > > You can test the accuracy with
>> > > > > OUTPUT -:"1 1 (=>./)"1 >{: predict__pipe INPUT
>> > > > > On Thursday, April 18, 2019, 11:36:35 AM GMT+9, Brian Schott
>> <
>> > [email protected]> wrote:
>> > > > >
>> > > > > I have renamed this message because the topic has changed, but
>> > > > > considered moving it to jchat as well. However, I settled on
>> > > > > jprogramming because there are definitely some J programming issues
>> > > > > to discuss.
>> > > > >
>> > > > > Jon,
>> > > > >
>> > > > > Your script code is beautifully commented and very valuable, imho.
>> > > > > The lack of an example has slowed down my study of the script, but
>> > > > > now I have some questions and comments.
>> > > > >
>> > > > > I gather from your comments that the word tensor is used to
>> > > > > designate a 4-dimensional array. That's new to me, but it is very
>> > > > > logical.
>> > > > >
>> > > > > Your definition convFunc=: +/@:,@:* works very well. However, for
>> > > > > some reason I wish I could think of a way to define convFunc in
>> > > > > terms of X=: dot=: +/ . * .
>> > > > >
>> > > > > The main insight I have gained from your code is that (x u;._3 y)
>> > > > > can be used with x of shape 2 n where n>2 (and not just 2 2). This
>> > > > > is great information. And that you built the convFunc directly into
>> > > > > cf is also very enlightening.
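>> > > > > For example (a small illustration of my own), with a rank-3 y and x
>> > > > > of shape 2 3, where the first row gives the movement and the second
>> > > > > the window size:
>> > > > >    $ (1 1 1 ,: 2 2 2) <;._3 i. 3 4 4
>> > > > > 2 3 3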
>> > > > >
>> > > > > I have created a couple of examples of the use of your function
>> > > > > `cf` to better understand how it works. [The data is borrowed from
>> > > > > the fine example at http://cs231n.github.io/convolutional-networks/#conv .
>> > > > > Beware that the dynamic example seen at the link changes every time
>> > > > > the page is refreshed, so you will not see the exact data I present,
>> > > > > but the shapes of the data are constant.]
>> > > > >
>> > > > > Notice that in my first experiments both `filter` and the RHA of
>> > > > > cf"3 are arrays and not tensors. Consequently(?) the result is an
>> > > > > array, not a tensor, either.
>> > > > >
>> > > > > i=: _7]\".;._2 (0 : 0)
>> > > > > 0 0 0 0 0 0 0
>> > > > > 0 0 0 1 2 2 0
>> > > > > 0 0 0 2 1 0 0
>> > > > > 0 0 0 1 2 2 0
>> > > > > 0 0 0 0 2 0 0
>> > > > > 0 0 0 2 2 2 0
>> > > > > 0 0 0 0 0 0 0
>> > > > > 0 0 0 0 0 0 0
>> > > > > 0 2 1 2 2 2 0
>> > > > > 0 0 1 0 2 0 0
>> > > > > 0 1 1 1 1 1 0
>> > > > > 0 2 0 0 0 2 0
>> > > > > 0 0 0 2 2 2 0
>> > > > > 0 0 0 0 0 0 0
>> > > > > 0 0 0 0 0 0 0
>> > > > > 0 0 0 1 2 1 0
>> > > > > 0 1 1 0 0 0 0
>> > > > > 0 2 1 2 0 2 0
>> > > > > 0 1 0 0 2 2 0
>> > > > > 0 1 0 1 2 2 0
>> > > > > 0 0 0 0 0 0 0
>> > > > > )
>> > > > >
>> > > > > k =: _3]\".;._2(0 :0)
>> > > > > 1 0 0
>> > > > > 1 _1 0
>> > > > > _1 _1 1
>> > > > > 0 _1 1
>> > > > > 0 0 1
>> > > > > 0 _1 1
>> > > > > 1 0 1
>> > > > > 0 _1 0
>> > > > > 0 _1 0
>> > > > > )
>> > > > >
>> > > > > $i NB. 3 7 7
>> > > > > $k NB. 3 3 3
>> > > > >
>> > > > > filter =: k
>> > > > > convFunc=: +/@:,@:*
>> > > > >
>> > > > > cf=: 4 : '|:"2 |: +/ x filter&(convFunc"3 3);._3 y'
>> > > > > (1 2 2,:3 3 3) cf"3 i NB. 3 3$1 1 _2 _2 3 _7 _3 1 0
>> > > > >
>> > > > > My next example makes both the `filter` and the RHA into tensors.
>> > > > > And notice the shape of the result shows it is a tensor, also.
>> > > > >
>> > > > > filter2 =: filter,:_1+filter
>> > > > > cf2=: 4 : '|:"2 |: +/ x filter2&(convFunc"3 3);._3 y'
>> > > > > $ (1 2 2,:3 3 3) cf2"3 i,:5+i NB. 2 2 3 3
>> > > > >
>> > > > > Much of my effort regarding CNNs has been studying the literature
>> > > > > that discusses efficient ways of computing these convolutions by
>> > > > > translating the filters and the image data into flattened (and
>> > > > > somewhat sparse) forms that can be restated in matrix formats. These
>> > > > > matrices accomplish the convolution and deconvolution as *efficient*
>> > > > > matrix products. Your demonstration of the way that J's ;._3 can be
>> > > > > so effective challenges the need for such efficiencies.
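>> > > > > As a rough sketch of that idea in J (a toy example of my own, not
>> > > > > from jlearn): unfold the 3x3 patches of a small image into rows, do
>> > > > > the whole convolution as one matrix product, and check it against
>> > > > > the windowed form:
>> > > > >    img =. i. 5 5
>> > > > >    ker =. 3 3 $ 1 0 _1  1 0 _1  1 0 _1
>> > > > >    patches =. ,"2 (1 1 ,: 3 3) ];._3 img  NB. 3 3 9: one raveled patch per position
>> > > > >    (patches +/ .* , ker) -: (1 1 ,: 3 3) (+/@:,@:*&ker);._3 img
>> > > > > 1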
>> > > > >
>> > > > > On the other hand, I could use some help understanding how the
>> > > > > 1 0 2 3 |: transpose you apply to `filter` is effective in the
>> > > > > backpropagation stage. Part of my confusion is that I would have
>> > > > > thought the transpose would have been 0 1 3 2 |: instead. Can you
>> > > > > say more about that?
>> > > > >
>> > > > > I have yet to try to understand your verbs `forward` and `backward`,
>> > > > > but I look forward to doing so.
>> > > > >
>> > > > > I could not find definitions for the following functions and wonder
>> > > > > if you can say more about them, please?
>> > > > >
>> > > > > bmt_jLearnUtil_
>> > > > > setSolver
>> > > > >
>> > > > > I noticed that your definitions of relu and derivRelu were more
>> > > > > complicated than mine, so I attempted to test yours out against mine
>> > > > > as follows.
>> > > > >
>> > > > >
>> > > > >
>> > > > > relu =: 0&>.
>> > > > > derivRelu =: 0&<
>> > > > > (relu -: 0:`[@.>&0) i: 4
>> > > > > 1
>> > > > > (derivRelu -: 0:`1:@.>&0) i: 4
>> > > > > 1
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Sun, Apr 14, 2019 at 8:31 AM jonghough via Programming <
>> > > > > [email protected]> wrote:
>> > > > >
>> > > > >> I had a go writing conv nets in J.
>> > > > >> See
>> > > > >> https://github.com/jonghough/jlearn/blob/master/adv/conv2d.ijs
>> > > > >>
>> > > > >> This uses ;._3 to do the convolutions. Using a version of this,
>> > > > >> with a couple of fixes, I managed to get 88% accuracy on the
>> > > > >> CIFAR-10 image set. It took several days to run, as my algorithms
>> > > > >> are not optimized in any way, and no GPU was used.
>> > > > >> If you look at the references in the above link, you may get some
>> > > > >> ideas.
>> > > > >>
>> > > > >> The convolution verb is defined as:
>> > > > >> cf=: 4 : 0
>> > > > >> |:"2 |: +/ x filter&(convFunc"3 3);._3 y
>> > > > >> )
>> > > > >>
>> > > > >> Note that since the input is a batch of images, each 3-d (width,
>> > > > >> height, channels), we are actually doing the whole forward pass
>> > > > >> over a 4-d array, and outputting another 4-d array of different
>> > > > >> shape, depending on output channels, filter width, and filter
>> > > > >> height.
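>> > > > >> For example, the spatial part of that shape change works out as
>> > > > >> follows (a rough sketch with made-up sizes, assuming stride 1 and
>> > > > >> no padding):
>> > > > >>    'h w' =. 32 32               NB. input height and width
>> > > > >>    'kh kw' =. 4 4               NB. filter height and width
>> > > > >>    (1 + h - kh) , (1 + w - kw)  NB. output feature-map height and width
>> > > > >> 29 29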
>> > > > >>
>> > > > >> Thanks,
>> > > > >> Jon
>> > > > >>
>> > > > > Thank you,
>> > > > >
>> > > >
>> > > >
>> > > >
>>
>>
>>
>>
>
>
>
>
--
Devon McCormick, CFA
Quantitative Consultant
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm