OK, I got CDH3u0 working. The problem, of course, is that the job called BBt has to be coerced to use 1 reducer. That's fine: each mapper yields no more than an upper-triangular matrix of (k+p) x (k+p) geometry, so even if you end up with thousands of them, a single reducer will sum them up just fine.
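To make the size argument concrete, here is a minimal sketch of what a single reducer has to do with those partial results. It is an illustration under assumptions, not the actual BBt job code: the class and method names (UpperTriangularSum, packedSize, sumPartials) are hypothetical, the triangle is assumed to be stored as a packed array with the diagonal included, and the count of 3000 mappers is made up.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; this is not the Mahout BBt job code.
class UpperTriangularSum {

    // Entries in the packed upper triangle (diagonal included) of an n x n matrix.
    static int packedSize(int n) {
        return n * (n + 1) / 2;
    }

    // Element-wise sum of packed upper-triangular partials, which is all a
    // single reducer would need to do with the mapper outputs.
    static double[] sumPartials(List<double[]> partials, int n) {
        double[] acc = new double[packedSize(n)];
        for (double[] partial : partials) {
            if (partial.length != acc.length) {
                throw new IllegalArgumentException(
                        "expected " + acc.length + " entries, got " + partial.length);
            }
            for (int i = 0; i < acc.length; i++) {
                acc[i] += partial[i];
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        int k = 40, p = 60;  // rank and oversampling from the report in this thread
        int n = k + p;

        double[] ones = new double[packedSize(n)];
        Arrays.fill(ones, 1.0);

        // Assumption for illustration: 3000 mappers each emitted one partial.
        List<double[]> partials = Collections.nCopies(3000, ones);

        double[] sum = sumPartials(partials, n);
        System.out.println(packedSize(n) + " entries per partial, sum[0] = " + sum[0]);
    }
}
```

With k = 40 and p = 60, each partial holds 100 * 101 / 2 = 5050 doubles (about 40 KB), and the reducer only ever keeps one accumulator of that size, so the number of mappers affects running time but not memory.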
It apparently worked before because the configuration held 1 reducer by default when the count was not set explicitly. I am not quite sure whether it is a change in the Hadoop MR client or in Mahout that now precludes it from working. Anyway, I've got a patch (really a one-liner), and with it an example equivalent to yours worked fine for me with 3 reducers.
The tests also request 3 reducers; the reason it works in the tests and not in distributed mapred is that local mapred doesn't support multiple reducers. I investigated this issue before, and apparently there were a couple of patches floating around, but for some reason those changes did not take hold in CDH3u0.
I will publish the patch in a JIRA shortly and will commit it Sunday-ish.

Thanks.
-d

On Fri, Aug 5, 2011 at 7:06 PM, Eshwaran Vijaya Kumar <[email protected]> wrote:

> OK. So to add more info to this, I tried setting the number of reducers to
> 1 and now I don't get that particular error. The singular values and left
> and right singular vectors appear to be correct though (verified using
> Matlab).
>
> On Aug 5, 2011, at 1:55 PM, Eshwaran Vijaya Kumar wrote:
>
> > All,
> > I am trying to test Stochastic SVD and am facing some errors where it
> > would be great if someone could clarify what is going on. I am trying to
> > feed the solver a DistributedRowMatrix with the exact same parameters that
> > the test in LocalSSVDSolverSparseSequentialTest uses, i.e., generate a 1000
> > x 100 DRM with SequentialSparseVectors and then ask for blockHeight 251, p
> > (oversampling) = 60, k (rank) = 40.
> > I get the following error:
> >
> > Exception in thread "main" java.io.IOException: Unexpected overrun in upper triangular matrix files
> >     at org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver.loadUpperTriangularMatrix(SSVDSolver.java:471)
> >     at org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver.run(SSVDSolver.java:268)
> >     at com.mozilla.SSVDCli.run(SSVDCli.java:89)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >     at com.mozilla.SSVDCli.main(SSVDCli.java:129)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> >
> > Also, I am using CDH3 with Mahout recompiled to work with CDH3 jars.
> >
> > Thanks
> > Esh
