Hi Sab, Thanks for your response. We’re thinking of trying a bigger cluster, because we just started with 2 nodes. What we really want to know is whether the code will scale up with larger matrices and more nodes. I’d be interested to hear how large a matrix multiplication you managed to do?
Is there an alternative you’d recommend for calculating similarity over a large dataset? Thanks, Eilidh On 13 Nov 2015, at 09:55, Sabarish Sasidharan <sabarish.sasidha...@manthan.com> wrote: > We have done this by blocking but without using BlockMatrix. We used our own > blocking mechanism because BlockMatrix didn't exist in Spark 1.2. What is the > size of your block? How much memory are you giving to the executors? I assume > you are running on YARN, if so you would want to make sure your yarn executor > memory overhead is set to a higher value than default. > > Just curious, could you also explain why you need matrix multiplication with > transpose? Smells like similarity computation. > > Regards > Sab > > On Thu, Nov 12, 2015 at 7:27 PM, Eilidh Troup <e.tr...@epcc.ed.ac.uk> wrote: > Hi, > > I’m trying to multiply a large squarish matrix with its transpose. Eventually > I’d like to work with matrices of size 200,000 by 500,000, but I’ve started > off first with 100 by 100 which was fine, and then with 10,000 by 10,000 > which failed with an out of memory exception. > > I used MLlib and BlockMatrix and tried various block sizes, and also tried > switching disk serialisation on. > > We are running on a small cluster, using a CSV file in HDFS as the input data. > > Would anyone with experience of multiplying large, dense matrices in spark be > able to comment on what to try to make this work? > > Thanks, > Eilidh > > > -- > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > > > > > -- > > Architect - Big Data > Ph: +91 99805 99458 > > Manthan Systems | Company of the year - Analytics (2014 Frost and Sullivan > India ICT) > +++
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
--------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org