Hi,

I am trying to run ssvd on Amazon EMR, but I am getting a LeaseExpiredException during the execution of the ABt job. I posted about my problem on the AWS forum (here <http://forums.aws.amazon.com/thread.jspa?threadID=126294&tstart=0>), as I first thought it could be a problem with EMR. The reply I got indicates that it's a problem with the ssvd implementation. I have successfully used ssvd before to decompose many other datasets with different parameter settings. Is it possible that I get that exception for only this one dataset?
The dataset I am using here is PubMed abstracts, 8.2m x 141k. The ssvd params I am using are: rank = 100, oversampling = 15, power iterations = 2, and ABt blocksize = 10000. The dataset is partitioned into 36 blocks on a 10-node EMR cluster. My jar just runs the code below with the arguments:

/user/data/pubmed.dat /user/pubmed/ssvd100/tmp 8200000 141043 /user/pubmed/ssvd100/out 10000 100 15 16 2

    Configuration conf = new Configuration();

    DistributedRowMatrix A = new DistributedRowMatrix(
        new Path(args[0]),
        new Path(args[1]),
        Integer.parseInt(args[2]),
        Integer.parseInt(args[3]));
    A.setConf(conf);

    SSVDSolver ssvdSolver = new SSVDSolver(
        conf,
        new Path[] { A.getRowPath() },
        new Path(args[4]),
        Integer.parseInt(args[5]),
        Integer.parseInt(args[6]),
        Integer.parseInt(args[7]),
        Integer.parseInt(args[8]));
    ssvdSolver.setQ(Integer.parseInt(args[9]));
    ssvdSolver.setComputeV(true);
    ssvdSolver.run();

thanks,
--ahmed
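
P.S. In case it helps when reading the argument list above, here is a minimal sketch (plain Java, no Mahout or Hadoop dependency) that labels the ten positional arguments in the order the code above consumes them. The class name SsvdArgs and the label strings are made up for illustration only; they are not Mahout identifiers, and the labels for positions 5 to 9 simply follow the parameter description given in this post.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper only: pairs each positional argument with a readable label.
class SsvdArgs {

    // Hypothetical labels, listed in the order args[0]..args[9] are used above.
    static final String[] NAMES = {
        "inputPath",        // args[0]: DistributedRowMatrix input
        "tmpPath",          // args[1]: DistributedRowMatrix temp dir
        "numRows",          // args[2]
        "numCols",          // args[3]
        "outputPath",       // args[4]: SSVDSolver output
        "blockSize",        // args[5]: described in the post as the ABt blocksize
        "rank",             // args[6]
        "oversampling",     // args[7]
        "reduceTasks",      // args[8]
        "powerIterations"   // args[9]: passed to setQ(...)
    };

    // Returns label -> raw argument, preserving positional order.
    static Map<String, String> label(String[] args) {
        if (args.length != NAMES.length) {
            throw new IllegalArgumentException(
                "expected " + NAMES.length + " arguments, got " + args.length);
        }
        Map<String, String> labelled = new LinkedHashMap<>();
        for (int i = 0; i < NAMES.length; i++) {
            labelled.put(NAMES[i], args[i]);
        }
        return labelled;
    }

    public static void main(String[] args) {
        // The argument list quoted in the post.
        String[] example = {
            "/user/data/pubmed.dat", "/user/pubmed/ssvd100/tmp",
            "8200000", "141043", "/user/pubmed/ssvd100/out",
            "10000", "100", "15", "16", "2"
        };
        Map<String, String> labelled = label(example);
        labelled.forEach((name, value) -> System.out.println(name + " = " + value));

        // rank + oversampling, computed from the labelled values.
        int k = Integer.parseInt(labelled.get("rank"));
        int p = Integer.parseInt(labelled.get("oversampling"));
        System.out.println("rank + oversampling = " + (k + p));
    }
}
```

With the arguments above, position 8 (the value 16) ends up as the number of reduce tasks, and rank + oversampling comes to 115.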