Hi,

I am trying to run ssvd on Amazon EMR, but I am getting a
LeaseExpiredException during the execution of the ABt job. I posted about
my problem on the AWS forum
(here<http://forums.aws.amazon.com/thread.jspa?threadID=126294&tstart=0>)
because I first thought it could be a problem with EMR. The reply I got
indicates that it's a problem with the ssvd implementation. I have
successfully used ssvd before to decompose many other datasets with
different parameter settings. Is it possible that I get that exception for
only one dataset?

The dataset I am using here is PubMed abstracts, 8.2m x 141k. The ssvd params
I am using are: rank = 100, oversampling = 15, power iterations = 2, and
ABt blocksize = 10000. The dataset is partitioned into 36 blocks on a
10-node EMR cluster.

My jar just runs the code below with the arguments: /user/data/pubmed.dat
/user/pubmed/ssvd100/tmp 8200000 141043 /user/pubmed/ssvd100/out 10000 100
15 16 2

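        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.Path;
        import org.apache.mahout.math.hadoop.DistributedRowMatrix;
        import org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver;

        // args[0..3]: input path, tmp path, number of rows, number of columns
        Configuration conf = new Configuration();
        DistributedRowMatrix A = new DistributedRowMatrix(new Path(args[0]),
                new Path(args[1]), Integer.parseInt(args[2]),
                Integer.parseInt(args[3]));
        A.setConf(conf);

        // args[4]: output path; args[5..8] are passed to the constructor in
        // the order block height, rank (k), oversampling (p), reduce tasks
        SSVDSolver ssvdSolver = new SSVDSolver(conf,
                new Path[] { A.getRowPath() }, new Path(args[4]),
                Integer.parseInt(args[5]), Integer.parseInt(args[6]),
                Integer.parseInt(args[7]), Integer.parseInt(args[8]));
        // args[9]: number of power iterations (q)
        ssvdSolver.setQ(Integer.parseInt(args[9]));
        ssvdSolver.setComputeV(true);
        ssvdSolver.run();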
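
In case it makes the argument list above easier to read, here is a small
standalone sketch that labels the ten positional arguments. It does not
depend on Mahout or Hadoop. The class name SsvdArgs and the label strings are
hypothetical and only for illustration; the labels for args[5] to args[8]
follow my understanding of the SSVDSolver constructor order, so treat them as
an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: gives names to the positional arguments of the driver.
public class SsvdArgs {
    // Assumed labels, in the order the driver reads args[0]..args[9].
    static final String[] NAMES = {
        "input", "tmpDir", "numRows", "numCols", "output",
        "blockHeight", "rank", "oversampling", "reduceTasks", "powerIterations"
    };

    // Returns name -> value in argument order; rejects a wrong argument
    // count and non-integer values in the numeric positions.
    static Map<String, String> parse(String[] args) {
        if (args.length != NAMES.length) {
            throw new IllegalArgumentException(
                "expected " + NAMES.length + " arguments, got " + args.length);
        }
        Map<String, String> named = new LinkedHashMap<>();
        for (int i = 0; i < NAMES.length; i++) {
            boolean isPath = (i == 0 || i == 1 || i == 4);
            if (!isPath) {
                Integer.parseInt(args[i]); // throws NumberFormatException if not an int
            }
            named.put(NAMES[i], args[i]);
        }
        return named;
    }

    public static void main(String[] args) {
        // Fall back to the argument list from this post when none are given.
        String[] effective = args.length > 0 ? args : new String[] {
            "/user/data/pubmed.dat", "/user/pubmed/ssvd100/tmp", "8200000",
            "141043", "/user/pubmed/ssvd100/out", "10000", "100", "15", "16", "2"
        };
        for (Map.Entry<String, String> e : parse(effective).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Run without arguments, it prints one "name = value" line per argument for the
run described in this post.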



thanks,

--ahmed
