First, 50 million integers require 200MB of storage in 32-bit J and 400MB in 64-bit J. A floating point representation will also require 400MB.
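As a quick check of those figures, here is a small J sketch. The names n, bytes, and sample are only illustrative, and the 7!:5 result will include some header and allocation overhead, so treat it as a rough confirmation rather than an exact accounting:

```j
n =: 50e6                 NB. number of items
bytes =: n * 4 8          NB. 4 bytes per integer in 32-bit J, 8 in 64-bit J
bytes % 1e6               NB. sizes in (decimal) MB: 200 400

sample =: i. 1000         NB. small integer array for a sanity check
3!:0 sample               NB. 4 means integer
3!:0 sample + 0.5         NB. 8 means floating point (8 bytes per item)
7!:5 <'sample'            NB. bytes actually used by the name, including overhead
```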
And, you are going to need enough memory to store intermediate results. A good rule of thumb (when you are not ready to perform detailed analysis) is: make sure you have at least 5 times as much memory as the largest intermediate value in an expression. Complicated expressions might require additional memory, of course. Here, that's 2GB.

One place you can try to save memory is by not retaining intermediate results in variables.

Also, I believe your moving average definition is using O(n^2) time, where n is the length of the data you are working with. The following is much faster (linear time), but in my tests it uses slightly more memory (7!:2 reported memory use of 3x my argument size, instead of 2.5x, for the test I was running):

   ma=: +/\ % [

FYI,

--
Raul

On Tue, May 8, 2012 at 6:56 AM, Joe Bohart <jboh...@gmail.com> wrote:
> Hi J'ers,
>
> I'm trying to perform calculations on massive data sets using mapped files
> and after searching the forum/essays am stuck.
>
> I've loaded 50 million integer digits of pi and am trying to do a moving
> average on them. I've created two mapped files, one for the data and one for
> the calculated results, but I get out of memory errors. I'm not seeing what
> I'm doing wrong here.
>
> My goal is to work with floating point numbers eventually, but I figured
> I'd start with integers first, since there are more examples for integers
> than floats in the 'mapped files' Labs (if anyone has the code for the
> example mentioned in slide 67 of 68 of the Lab on mapped files using NYSE
> stock prices - I'd love to see it).
>
> load 'jmf files dir'
> NB. data from http://zenwerx.com/projects/pi-digits/pi/
> NB. used perl to write each digit on 1 line of file data
> pidata =: fread '/home/joe/pi-study/data'
> NB. type is 2 (literal)
> 3!:0 pidata
> NB. convert to 4-byte integers
> piInt =: _2 (3!:4) pidata
> NB. shape is 50,000,000
> $piInt
> ]sz=:7!:5 <'piInt'
> NB. define moving average
> ma =: mean \
> NB. out of memory ma, expected since storing results in ram
> smoothPi =: 100 ma piInt
>
> NB. create jmf for results
> fn =: jpath '~temp/pi-results.jmf'
> ]sz=: 7!:5 <'piInt'
> createjmf_jmf_ fn;sz
> JFL map_jmf_ 'piResults'; fn
> piResults =. 100 ma piInt
> NB. still get out of memory!
>
> Much thanks !
> Joe
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
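P.S. On the linear-time point in the reply above: one way to see why a moving average need not redo the work for every window is to build it from running totals. This is only a sketch, not necessarily what the J interpreter does internally; maLinear is a hypothetical name, and it creates several full-size intermediate arrays, so it is not a memory saver. With floating point data the running totals can also accumulate rounding error.

```j
NB. x is the window length, y is the data (hypothetical helper)
maLinear =: 4 : 0
 c =. 0 , +/\ y                NB. running totals with a leading zero
 ((x }. c) - (-x) }. c) % x    NB. window sums as differences of totals, then divide
)

NB. compare with the straightforward windowed mean on a small argument
(2 maLinear 1 2 3 4 5) -: 2 (+/ % #)\ 1 2 3 4 5    NB. 1 (both give 1.5 2.5 3.5 4.5)
```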