So we need to manually construct the verb name with the correct locale. Thanks.
This seems to work.
in_z_=: 1 : 0  NB. accepts a simple named verb
l=.,&(,&'_'@:,@:>)
(u`'' l coname'')~
)
c=:0
t=:3 :0
c + y
)
a=:1 :0
echo u`''
u y
)
f=:3 :0
C=.conew'base'
c__C=:2
z=.(t in__C a;*:@(t in_
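The example above is cut off mid-line; here is a self-contained sketch of the same idea (a hypothetical completion for illustration, not the original code — `a` is simplified to just apply `u`):

```j
NB. sketch, assuming in_z_ as defined above
c=: 0
t=: 3 : 'c + y'
a=: 1 : 'u y'
f=: 3 : 0
C=. conew 'base'
c__C=: 2
NB. t in__C builds the name 't_<locale>_' and evaluates it, so the
NB. derived verb runs t with C's implied locale and sees c__C
z=. t in__C a y
codestroy__C ''
z
)
```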
Certainly simd in avx helps; some more tests:
NB. non-avx j806
28.124 2.68438e8
1.12953 5.36873e8
1
NB. non-avx j806 (using sse2)
21.5087 2.68438e8
1.13762 5.36873e8
1
NB. avx j806
19.4048 2.68438e8
1.13984 5.36873e8
1
The non-avx j806 runs 4x faster than j602.
The advantage of avx over sse2 is
ok, error is in a
loc_z_ =: (,&'_'@[ ,&'_'@, ":@>@])"1 0 boxopen
locs_z_ =: 1 : 'm loc 18!:5 '''''
f=:3 :0
C=.conew'base'
c__C=:1
z=. 't' locs__C~ a 0
codestroy__C''
z
)
From: Xiao-Yong Jin
To: "programm...@jsoftware.com"
Sent: Friday, April 21, 2017 9:
NB.BEGIN file 't'
c=:0
t=:3 :0
c + y
)
a=:1 :0
echo u`''
u y
)
f=:3 :0
C=.conew'base'
c__C=:1
z=.t__C a 0
codestroy__C''
z
)
NB.END file 't'
NB.BEGIN jconsole session
load't'
f''
┌────┐
│t__C│
└────┘
|value error: C
| u y
NB.END jconsole session
> On Apr 21, 2017, at 9
I don't expect there to be any complaint different from what you would see if
you ran these lines in the console.
By the time adv sees its parameters, C has been defined, and so no error,
unless there was a recent optimization that breaks this.
From: Xiao-Yong Jin
To: "programm..
'verbInClass__C f.' would create an anonymous verb that knows nothing about the
locale.
> On Apr 21, 2017, at 8:47 PM, Henry Rich wrote:
>
> will
>
> verbInClass__C f. adv y
>
> do?
>
> Henry Rich
>
> On 4/21/2017 8:37 PM, Xiao-Yong Jin wrote:
>> In an explicit definition of a verb, if I ha
will
verbInClass__C f. adv y
do?
Henry Rich
On 4/21/2017 8:37 PM, Xiao-Yong Jin wrote:
In an explicit definition of a verb, if I have, for example,
f=:3 :0
C=.conew'SomeClass'
verbInClass__C adv y
)
The anonymous verb created by 'verbInClass__C adv' is going to complain about
the unknown C.
I suspect avx512 will take some years to become commoditized. Right now, I
don't have any hardware to try.
On 22 Apr, 2017 2:57 am, "Jens Pfeiffer" wrote:
> Sorry for being late to the party.
>
> AVX has been around for quite a while now. The new kid on the block
> seems to be AVX-512:
> https:
In an explicit definition of a verb, if I have, for example,
f=:3 :0
C=.conew'SomeClass'
verbInClass__C adv y
)
The anonymous verb created by 'verbInClass__C adv' is going to complain about
the unknown C.
How do you actually pass the 'verbInClass__C' in this case?
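As the thread goes on to work out, `f.` strips the locale away entirely, so the approach that sticks is to build the locative name as text and evaluate it at call time. A minimal hedged sketch of that idea (the class name `SomeClass` and verb `vic` are illustrative stand-ins):

```j
NB. sketch: pass a verb from an object locale through an adverb by name
coclass 'SomeClass'
vic=: 3 : 'y + 1'      NB. hypothetical verbInClass
cocurrent 'base'

adv=: 1 : 'u y'

f=: 3 : 0
C=. conew 'SomeClass'
name=. 'vic_' , (> C) , '_'   NB. e.g. 'vic_0_'
z=. name~ adv y               NB. ~ evaluates the locative name; the
NB.                              locale travels inside the name itself
codestroy__C ''
z
)
```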
---
Sorry for being late to the party.
AVX has been around for quite a while now. The new kid on the block
seems to be AVX-512:
https://www.codeproject.com/Articles/1182515/Vectorization-Opportunities-for-Improved-Perform
Regards, Jens
Am 12.03.2017 um 17:35 schrieb Eric Iverson:
> The first AVX
Actually both AVX and cache are important: cache ordering is necessary
to read the input faster, then AVX instructions to process them. It
keeps a single CPU pretty busy, so BLAS must be doing something special.
Henry Rich
On 4/21/2017 12:10 PM, bill lam wrote:
Improvement coming from avx is
All I did was delete the loop-construct lines for both while/for so that
the code runs straight through one time for the non-loop version. Then I did
1000 (6!:2) 'f tlpmcmc 1e1'
for both loop/non-loop versions to get the average elapsed time figures
reported above.
For non-loop version the passed
Improvement coming from avx is not that important; the major improvement in
inner product is that the algorithm has been tuned to keep the cache hot. Using
sse2 can also achieve a similar improvement factor when using the new code.
BLAS is super optimized; it might be running in multiple threads, so that
No, just background loading had changed. Actual difference due to |:
should be negligible.
On 21 Apr, 2017 11:37 pm, "Xiao-Yong Jin" wrote:
Thanks. It's interesting to see how far the avx has come along.
And the |: takes more than 20% of the time? I guess this is something that
could be improved.
Thanks. It's interesting to see how far the avx has come along.
And the |: takes more than 20% of the time? I guess this is something that
could be improved.
> On Apr 21, 2017, at 7:42 AM, bill lam wrote:
>
> Opp, output from dgemm should be transposed to row major.
>
> dgemm=: 'liblapack.so
Can you give an example of how you changed the loops to no-loops?
I'm just curious.
> On Apr 21, 2017, at 11:04 AM, michael.goodr...@gmail.com wrote:
>
> Henry - regarding the neural network code I posted in wierdness #2, some
> quick and dirty timing tests:
>
> No loops. Loops
The 'no loops' code is the original with the while/for constructs removed.
Sent from my iPad
> On Apr 21, 2017, at 11:04, michael.goodr...@gmail.com wrote:
>
> Henry - regarding the neural network code I posted in wierdness #2, some
> quick and dirty timing tests:
>
> No loops.
Henry - regarding the neural network code I posted in wierdness #2, some quick
and dirty timing tests:
       No loops.  Loops.   Ratio
805:   0.00181    0.01067   5.9
806:   0.00191    0.025    13
Ratio: 1.06       2.34      2.21
Loops means 10
Opp, output from dgemm should be transposed to row major.
dgemm=: 'liblapack.so.3 dgemm_ > n *c *c *i *i *i *d *d *i *d *i *d *d *i'&cd
mm=: 4 : 0
k=. ,{.$x                   NB. matrix dimension (square matrices assumed)
c=. (k,k)$1.5-1.5           NB. result buffer; 1.5-1.5 forces float 0.0
NB. alpha=1 (2.5-1.5), beta=0 (1.5-1.5), boxed as 1-item float lists
dgemm (,'T');(,'T');k;k;k;(,2.5-1.5);x;k;y;k;(,1.5-1.5);c;k
|:c                         NB. dgemm writes column major; transpose back
)
'A B'=:0?@$~2,,~4096
echo timespacex'c1=: A+/ .*B'
echo
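A quick sanity check of the wrapper above against J's native product (assumes `dgemm` and `mm` as defined above and that liblapack.so.3 is loadable; agreement is up to floating-point rounding, which match `-:` compares tolerantly):

```j
NB. hypothetical check of mm against +/ .* on small random matrices
'a b'=: 0 ?@$~ 2 4 4
echo (a mm b) -: a +/ .* b   NB. 1 when the results agree
```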