Dyad (+/%#)\ is supported by special code (see http://jsoftware.com/help/release/infixavg.htm). However, testing shows that mean\ is not, so you'll want to use the expanded definition (or (mean\ f.), which also works).
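For anyone who wants to reproduce the comparison, here is a minimal sketch. It assumes the usual definition mean =: +/ % # and uses a smaller random sample than the one in this thread; the names ma and y are illustrative only, and the figures will depend on your J version and machine.

```j
NB. Sketch only: mean, ma and y are illustrative names, not taken from any library.
mean =: +/ % #            NB. assumed definition: sum divided by count
ma   =: (+/ % #)\         NB. expanded form, written out so the special code can apply
y    =: 1e6 ?@$ 10        NB. random digits 0..9 (smaller than the 5e7 case)

(100 ma y) -: 100 mean\ y NB. check that both phrasings agree

NB. 7!:2 reports the space needed to execute a sentence, 6!:2 the time
7!:2 '100 ma y'
7!:2 '100 mean\ y'
7!:2 '100 mean\ f. y'
6!:2 '100 ma y'
6!:2 '100 mean\ y'
```

Scaling y up towards 5e7 items should make the gap between the named and the expanded form easier to see.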
The special code allows me to run a moving average on 5e7?@$10 using only 0.5G of RAM, while mean\ requires 2G and much more time.

I haven't used memory-mapped files, though, so this is more of a general recommendation than a solution to your problem.

Marshall

On Tue, May 8, 2012 at 9:40 AM, Raul Miller <rauldmil...@gmail.com> wrote:
> First, 50 million integers require 200MB to store in 32bit J and
> 400MB to store in 64 bit J. Floating point representation will also
> require 400MB.
>
> And, you are going to need enough memory to store intermediate
> results. A good rule of thumb (when you are not ready to perform
> detailed analysis) is: make sure you have at least 5 times the amount
> of memory as the largest intermediate value in an expression.
> Complicated expressions might require additional memory, of course.
> Here, that's 2GB.
>
> One place you can try to save memory is by not retaining intermediate
> results in variables.
>
> Also, I believe your moving average definition is using O(n^2) time
> where n is the length of the data you are working with. The definition
> below is much faster (linear time), but in my tests it uses slightly
> more memory (7!:2 reported memory use of 3x my argument size, instead
> of 2.5x, for the test I was running).
>
>    ma=: +/\ % [
>
> FYI,
>
> --
> Raul
>
> On Tue, May 8, 2012 at 6:56 AM, Joe Bohart <jboh...@gmail.com> wrote:
> > Hi J'ers,
> >
> > I'm trying to perform calculations on massive data sets using mapped
> > files and after searching the forum/essays am stuck.
> >
> > I've loaded 50 million integer digits of pi and am trying to do a
> > moving average on them. I've created two mapped files, one for the
> > data and one for the calculated results, but I get out of memory
> > errors. I'm not seeing what I'm doing wrong here.
> >
> > My goal is to work with floating point numbers eventually, but I
> > figured I'd start with integers first, since there are more examples
> > for integers than floats in the 'mapped files' Labs (if anyone has the
> > code for the example mentioned on slide 67 of 68 of the Lab on mapped
> > files using NYSE stock prices - I'd love to see it).
> >
> > load 'jmf files dir'
> > NB. data from http://zenwerx.com/projects/pi-digits/pi/
> > NB. used perl to write each digit on its own line of file data
> > pidata =: fread '/home/joe/pi-study/data'
> > NB. type is 2 (literal)
> > 3!:0 pidata
> > NB. convert to 4-byte integers
> > piInt =: _2 (3!:4) pidata
> > NB. shape is 50,000,000
> > $piInt
> > ]sz=:7!:5 <'piInt'
> > NB. define moving average
> > ma =: mean \
> > NB. out of memory ma, expected since storing results in ram
> > smoothPi =: 100 ma piInt
> >
> > NB. create jmf for results
> > fn =: jpath '~temp/pi-results.jmf'
> > ]sz=: 7!:5 <'piInt'
> > createjmf_jmf_ fn;sz
> > JFL map_jmf_ 'piResults'; fn
> > piResults =. 100 ma piInt
> > NB. still get out of memory!
> >
> > Much thanks !
> > Joe
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm