Hi Glyn, Your suggestions to compute the mahalanobis distance work perfect. However, I have a follow up question. For large data sets, I am getting a MemoryError when running "mahdist = np.sum(np.sum(delta[None,:,:,:] * VI[:,:,None,None],axis=1) * delta,axis=0)"
# pro is list with raster names s = len(pro) dat_pro = None layer = garray.array() r, c = layer.shape for i, map in enumerate(pro): layer.read(map, null=np.nan) if dat_pro is None: dat_pro = np.empty((s, r, c), dtype=np.double) dat_pro[i,:,:] = layer del layer delta = dat_pro - stat_mean[:,None,None] mahdist = np.sum(np.sum(delta[None,:,:,:] * VI[:,:,None,None],axis=1) * delta, axis=0) stat_mahal = garray.array() stat_mahal[...] = mahdist I guess this is because the calculations are done in-memory? Any way to avoid this memory problem when using large data sets (something like working with memmap objects?) Cheers, Paulo On Mon, Feb 2, 2015 at 7:19 PM, Glynn Clements <gl...@gclements.plus.com> wrote: > > Paulo van Breugel wrote: > > > > The second version: > > > > > > > > VI = np.linalg.inv(covar) > > > > > delta = dat_ref - stat_mean[:,None,None] > > > > > m = np.sum(np.sum(delta[:,:,:,None] * delta[:,:,None,:] * > VI[None,None,:,:],axis=-1),axis=-1) > > > > > stat_mah = np.sqrt(m) > > > > > > should work with delta being a 3-D array. > > > > > > > Yes, but when running the lines above, I am getting the same error > message: > > > > Traceback (most recent call last): > > File "<stdin>", line 1, in <module> > > ValueError: operands could not be broadcast together with shapes > > (3,77,78,78) (1,1,3,3) > > Well, it's not quite the same; both operands now have the same number > of dimensions, but they're not compatible (essentially, for each > dimension either both values should be the same or one of them should > be one). > > > >From what I can see from broadcasting rules, there is mismatch in the > last > > and fore-last dimensions of VI[None,None,:,:] compared to the other two? > > > > delta[:,:,:,None].shape > > (3, 77, 78, 1) > > > > delta[:,:,None,:].shape > > (3, 77, 1, 78) > > > > VI[None,None,:,:].shape > > (1, 1, 3, 3) > > The problem was that delta is 3x77x78 whereas the code assumed that it > would be 77x78x3. > > > I am really have to get a better grasp of working with these arrays, but > in > > any case, after a bit trial and error, I came up with the following to > > compute m. > > > > m = np.sum(np.sum(delta[None,:,:,:] * VI[:,:,None,None],axis=1) * delta, > > axis=0) > > > > > > This does, in my limited testing, gives the same result as using the loop > > with > > > > pow(distance.mahalanobis(cell_ref, stat_mean, VI),2). > > > > > > to be sure, does the above makes sense to you? > > I believe that's correct. > > -- > Glynn Clements <gl...@gclements.plus.com> >
_______________________________________________ grass-dev mailing list grass-dev@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/grass-dev