Is the matrix by any chance constructed so that it may have rank < k? I think the MR code does not check for that.
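
To make the rank < k question concrete, here is a minimal sketch of a pre-check one could run on a small in-memory copy of the input before picking k. `numericalRank` and its `tol` parameter are hypothetical helpers written in plain Scala, not part of Mahout, and the absolute tolerance is an assumption that would need tuning for real data.

```scala
// Hypothetical helper, not a Mahout API: estimate the numerical rank of a
// small in-memory matrix by Gaussian elimination with partial pivoting.
// `tol` is an assumed absolute threshold below which a pivot counts as zero.
def numericalRank(rows: Array[Array[Double]], tol: Double = 1e-9): Int = {
  val m = rows.map(_.clone) // work on a copy so the caller's data is untouched
  val nRows = m.length
  val nCols = if (nRows == 0) 0 else m(0).length
  var rank = 0
  for (col <- 0 until nCols) {
    if (rank < nRows) {
      // Choose the row at or below `rank` with the largest entry in this column.
      val pivot = (rank until nRows).maxBy(r => math.abs(m(r)(col)))
      if (math.abs(m(pivot)(col)) > tol) {
        val tmp = m(pivot); m(pivot) = m(rank); m(rank) = tmp
        // Eliminate this column from all rows below the pivot row.
        for (r <- rank + 1 until nRows) {
          val f = m(r)(col) / m(rank)(col)
          for (c <- col until nCols) m(r)(c) -= f * m(rank)(c)
        }
        rank += 1
      }
    }
  }
  rank
}

// Usage: the all-zero 2x2 matrix has rank 0, so asking for k = 2 is not safe.
val zero = Array(Array(0.0, 0.0), Array(0.0, 0.0))
val k = 2
val safeToRun = numericalRank(zero) >= k
```

If the estimated rank comes out below k, lowering k is probably the first thing to try.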

In spark shell I have:

mahout> val a = dense( (0,0),(0,0) )
a: org.apache.mahout.math.DenseMatrix = { 0 => {} 1 => {} }
mahout> svd(a)
res0: (org.apache.mahout.math.Matrix, org.apache.mahout.math.Matrix, org.apache.mahout.math.DenseVector) = ({ 0 => {0:1.0} 1 => {1:1.0} },{ 0 => {0:-1.0} 1 => {1:-1.0} },{})

But:

mahout> ssvd(a,2,0)
java.lang.AssertionError: assertion failed: Rank-deficiency detected during s-SVD

or

mahout> val drmA = drmParallelize(a,2)
mahout> dssvd(drmA, k=2)
java.lang.IllegalArgumentException: R is rank-deficient.

The MR version doesn't check for these effects and it may create some degenerate results, although I thought those should be 0s, at least when -q=0. I am not sure for -q=1,2...

On Thu, Oct 30, 2014 at 10:35 PM, Yang <teddyyyy...@gmail.com> wrote:

> i am talking about the MR one.
>
> thanks
> yang
> On Oct 30, 2014 8:16 PM, "Dmitriy Lyubimov" <dlie...@gmail.com> wrote:
>
> > This is not a known problem...
> >
> > there are few ssvd here, sequential, MR and spark one. for the record,
> > which one are you running?
> >
> > On Thu, Oct 30, 2014 at 4:37 PM, Yang <teddyyyy...@gmail.com> wrote:
> >
> > > we are running ssvd on a dataset (this one is relatively small, with 8000
> > > rows, number of columns is 64 ), we ran it with rank = 58, since sampling
> > > p=5.
> > >
> > > the result had NaN on multiple columns.
> > >
> > > why would this appear ?
> > >
> > > I am now running with lower rank=20 , to see if it goes away.
> > >
> > > Thanks
> > > Yang