Ah. That is not the best method, since it will fragment your memory -- you're creating and destroying thousands of progressively larger PDLs, which end up spread out over your workspace. You're also copying the first row 15,000 times, the second row 14,999 times, etc. When you work with objects that big you really have to think a bit about what your computer is actually doing, and not just assume memory is infinite.
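To put a rough number on that copying, here is a back-of-the-envelope sketch in plain Perl (no PDL needed). It takes the row counts above at face value and assumes 8-byte double-precision values, which is what pdl() produces from ordinary Perl numbers; the variable names are just for illustration.

```perl
use strict;
use warnings;

my $n             = 15_000;  # rows and columns, as in the thread
my $bytes_per_val = 8;       # assumption: double-precision values

# First row copied n times, second row n-1 times, ... last row once:
# n + (n-1) + ... + 1 = n(n+1)/2 row copies in total.
my $row_copies = $n * ($n + 1) / 2;

# Each row holds n values.
my $bytes_moved = $row_copies * $n * $bytes_per_val;

printf "row copies:  %d\n",      $row_copies;
printf "data copied: %.1f TB\n", $bytes_moved / 1e12;
printf "final size:  %.1f GB\n", $n * $n * $bytes_per_val / 1e9;
```

Under those assumptions the glue loop shuffles on the order of 13.5 TB around in order to build a 1.8 GB matrix, which is why allocating once and filling in place is so much cheaper.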
You'd be better off slotting each row into a predeclared array:

    $matrix = PDL->new_from_specification(15000,15000); # or use zeroes()

and then in your loop:

    $matrix->(:,($index)) .= pdl($p_arrayref); # this slice syntax needs "use PDL::NiceSlice;"

That generates the large array up front, and then uses computed assignment to slot each row into place. I've left out most of the code since I don't know whether you're generating the values or reading them in from a file. If you're reading them in, you might want to look at PDL::IO -- it probably has something that can help you.

There are a lot of ways to speed up large-scale square matrix multiplication -- brute force, which works great for small to "midsized" matrices, is an O(n^3) operation, so as you see it gets big and bad fast. You should google Strassen's Algorithm, which is O(n^2.81) or so -- or even the Coppersmith-Winograd algorithm, which is O(n^2.38). I think one or both of those algorithms are in the GNU Scientific Library. There is example code in the PDL distribution for linking GSL functions into your PDL code -- you could try that.

On Sep 5, 2014, at 10:08 AM, Ronak Agrawal <ronagra...@gmail.com> wrote:

> To create the matrix I am calling the following snippet for 15000 times with
> array reference as parameter
> {
>     $p_arrayref = $_[0];
>     my $p_new = pdl ( [@$p_arrayref] );
>     $matrix = $matrix->glue(1,$p_new);
> }
>
> Since the brute force method $c = $a x $b will take 200 hours, can you guide
> me with better approach..
>
> On Fri, Sep 5, 2014 at 9:19 PM, Craig DeForest <defor...@boulder.swri.edu>
> wrote:
> If your matrix is not necessarily sparse, you will have to process it all
> through memory. PDL is optimized for problems that fit in your machine's RAM
> limit. 15000x15000 floats is 900 MB, which should fit within most machines.
> (15000x15000 double-precision values is 1.8 GB, which should also be OK).
> You'll need to set the global variable $PDL::BIGPDL to 1 to let Perl know you
> plan to work with arrays that large.
>
> My laptop computer has 16GB of RAM. This works fine:
>
> use PDL;
> $a = random(15000,15000); # generate 15000x15000 array of random numbers
> $b = random(15000,15000); # generate another one
>
> If you're running out of memory you may be trying to do something silly like
> read all the numbers in as Perl scalars...?
>
> On the other hand, this may take a while:
>
> $c = $a x $b; # brute-force matrix multiply -- ~200 hours to complete
>
> The reason is that the final step requires (8 * 15000 * 3 * 15000 * 15000)
> memory accesses.
>
> Finding eigenvalues of a 15000x15000 matrix is a nontrivial process. PDL has
> an eigenvalue solver ("eigens") but it is a general purpose tool for small
> matrices, it would take considerably longer than the age of the Universe to
> find the eigenvalues of a 15000x15000 nonsparse matrix -- so your project
> might be a little late if you use that.
>
> Working with large matrices is its own computational subject. PDL makes a
> nice framework for it, but for any serious operations you can't just use the
> kind of general purpose tools that work fine on (say) a 10x10 matrix.
>
> On Sep 5, 2014, at 9:12 AM, Ronak Agrawal <ronagra...@gmail.com> wrote:
>
>> Thank You Sir for the early response.
>>
>> I am new to Perl and have been assigned project on Topic Modeling where I
>> have to search, browse and find information from large archives of texts.
>>
>> Matrix operation is one of the operation and as per requirement my matrix
>> may be sparse or dense. Is it possible for you help me with both the cases.
>>
>> More to that can you tell me some good methods to handle large data in Perl.
>>
>> Once again thank you for the response
>>
>> On Fri, Sep 5, 2014 at 7:36 PM, Craig DeForest <defor...@boulder.swri.edu>
>> wrote:
>> Glad to help. First, a few questions. Is the matrix sparse? (i.e. are
>> less than, say 10^-3 of the elements nonzero?) How close to tridiagonal is
>> it?
>>
>> On Sep 5, 2014, at 6:27 AM, Ronak Agrawal <ronagra...@gmail.com> wrote:
>>
>>> Hi
>>> I am doing a project in Topic Modelling which involves large matrix
>>> operations.
>>> I have a sql database from where I have to generate 15000 x 15000 matrix -
>>> transform and obtain A'A. Later I have to find Eigen Values and Eigen
>>> Vectors.
>>> Can you suggest me ways to do this in Perl. I get "Out of Memory" while
>>> storing the matrix in memory.
>>> Your input will help in handling big data and thereby making my project
>>> success
>>> Thank You
>>> Ronak
>>> _______________________________________________
>>> Perldl mailing list
>>> Perldl@jach.hawaii.edu
>>> http://mailman.jach.hawaii.edu/mailman/listinfo/perldl