My recent postings concerning mapped variables that contain boxed data are
motivated by an application I am building.
Several months ago I posted an item about the non-linear time growth of item
amend on ordinary boxed arrays. Receiving no feedback, I developed
cumbersome workarounds, until I noticed that for mapped variables item
amend is pretty much linear.
In the following results, which were obtained by running the code
lower down, mapped variables clearly scale better: filling a 10000-item
boxed array took about 98 seconds with an ordinary variable and under half
a second with a mapped one.
Of course, it would be nice if all the primitives behaved properly with
mapped variables.
But in practice I think that all I need (for the moment) is a way to change
the size of the array efficiently - namely:
    mapped =: N{.mapped
where N may be greater than or less than #mapped.
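
For reference, this is how take behaves on an ordinary (unmapped) boxed list, which is the behaviour wanted from the mapped variable. It is only a small sketch; the names b, short and long are made up for the illustration.

```j
b     =: ;: 'alpha beta gamma'   NB. ordinary boxed list with 3 items
short =: 2 {. b                  NB. N < #b : keeps the first 2 items
long  =: 5 {. b                  NB. N > #b : pads with empty boxes (a:)
```

Here long matches b , 2 $ a: because the fill for a boxed array is the empty box.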
Up till now the only thing I have found that works is to use two mapped
variables and an explicit function that loops and copies the data from the
old to the new. The drawback is that I lose a lot of address
space (maybe as much as 50%) - although the speed of copying is really good.
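
A minimal sketch of that kind of copy loop follows. It is not the actual function from the application: copyinto is a made-up name, and ordinary boxed arrays stand in for the two mapped variables so that the example runs without jmf. With mapped variables the amend would presumably assign to the mapped global name instead of a local.

```j
NB. x is the old boxed list, y is the new (differently sized) boxed list.
NB. Copies as many items as fit, one item amend at a time.
copyinto =: 4 : 0
 new =. y
 for_i. i. (#x) <. #y do.
  new =. (i { x) i} new
 end.
 new
)

old    =: ;: 'F S J D Q'
grown  =: old copyinto 8 $ a:    NB. target larger than the source
shrunk =: old copyinto 3 $ a:    NB. target smaller than the source
```

The results agree with 8 {. old and 3 {. old respectively, which is why a working N{.mapped would make the second variable unnecessary.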
I have looked at the code in jmf.ijs but nothing jumps out at me as a
way forward.
Suggestions would be appreciated.
Here are some timings. The count column gives the size of the
boxed array, e.g. 1000$a:
The time column is how long, in seconds as reported by 6!:2, it took to
replace each element of the boxed array with a character vector of random
length between 0 and 9999.
The difference is rather striking.
It would be great if either Eric or Roger could give a brief explanation as to
why the differences are so pronounced.
         count  time (s)
NORMAL     100  0.0101963
NORMAL    1000  0.238853
NORMAL    2000  1.08816
NORMAL    3000  3.45866
NORMAL    4000  10.3032
NORMAL    5000  16.3996
NORMAL   10000  98.3375

MAPPED     100  0.0115663
MAPPED    1000  0.11231
MAPPED    2000  0.12923
MAPPED    3000  0.126278
MAPPED    4000  0.162611
MAPPED    5000  0.363064
MAPPED   10000  0.460686
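
To put a rough number on the difference, the timings can be fed back into J. This is only a crude estimate from two rows of the table (1000 and 10000); the names count, tnormal, tmapped, speedup and expo are introduced here for the illustration and are not part of the test script.

```j
NB. timings copied from the table above (seconds)
count   =: 100 1000 2000 3000 4000 5000 10000
tnormal =: 0.0101963 0.238853 1.08816 3.45866 10.3032 16.3996 98.3375
tmapped =: 0.0115663 0.11231 0.12923 0.126278 0.162611 0.363064 0.460686

speedup =: tnormal % tmapped     NB. NORMAL time divided by MAPPED time, per row

NB. if time grows like count^e, then e = log(t2 % t1) % log(n2 % n1);
NB. rows 1 and 6 are the 1000 and 10000 trials
expo    =: 3 : '(^. %/ 6 1 { y) % ^. %/ 6 1 { count'
enormal =: expo tnormal          NB. roughly 2.6
emapped =: expo tmapped          NB. roughly 0.6
```

On these figures the ordinary array grows faster than quadratically between 1000 and 10000 items, while the mapped array grows more slowly than linearly over the same range.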
require 'jmf'
unmapall_jmf_ ''
createjmf_jmf_ 'testdata';150e6
map_jmf_ 'mdata';'testdata'
chardata =: a.{~ 65+?1000 10000$60 NB. some simple character test data
time =: (6!:2)
show =: (1!:2)&2
trial =: 100 1000 2000 3000 4000 5000 10000
NN =: 10000
N =: 1000
NB. define one verb that works with an in-memory variable,
NB. and another with a mapped file.
normal =: 3 : 'y=.y$a: for_ii. ?.~#y do. y=.(<(?.NN){.(N|ii){chardata)ii}y end. 1'
map =: 3 : 'mdata=:y$a: for_ii. ?.~#y=.mdata do. y=.(<(?.NN){.(N|ii){chardata)ii}y end. 1'
NB. now run the trials
'NORMAL ' 1 : 'for_k. trial do. show x ,": k, time''normal k'' end. show i.0'
'MAPPED ' 1 : 'for_k. trial do. show x ,": k, time ''map k'' end. show i.0'
And finally just for fun:
mdata=:;:'F S J D Q U e Q P'
mdata
+-+-+-+-+-+-+-+-+-+
|F|S|J|D|Q|U|e|Q|P|
+-+-+-+-+-+-+-+-+-+
mdata =: 4{.mdata
mdata
+-+-+-+-+
|E|R|I|C|
+-+-+-+-+
(but I think it works differently in Darwin...)
Regards
David