I think the distance matrix version below is about as good
as it gets with these basic strategies.
fwiw,
Alan Isaac
def dist(A,B):
    rowsA, rowsB = A.shape[0], B.shape[0]
    distanceAB = empty( [rowsA,rowsB] , dtype=float)
    if rowsA <= rowsB:
        temp = empty_like(B)
        for i in r
Tim Hochberg wrote:
>Sebastian Beca wrote:
>
>
>
>>I just ran Alan's script and I don't get consistent results for 100
>>repetitions. I boosted it to 1000, and ran it several times. The
>>faster one varied a lot, but the two stayed within about ±1.5% of each other.
>>
>>When it comes to scaling, for my pro
Sebastian Beca wrote:
>I just ran Alan's script and I don't get consistent results for 100
>repetitions. I boosted it to 1000, and ran it several times. The
>faster one varied a lot, but the two stayed within about ±1.5% of each other.
>
>When it comes to scaling, for my problem (fuzzy clustering), N is the
>
I just ran Alan's script and I don't get consistent results for 100
repetitions. I boosted it to 1000, and ran it several times. The
faster one varied a lot, but the two stayed within about ±1.5% of each other.
When it comes to scaling, for my problem (fuzzy clustering), N is the
size of the dataset, which sh
On Sun, 18 Jun 2006, Tim Hochberg apparently wrote:
> Alan G Isaac wrote:
>> On Sun, 18 Jun 2006, Sebastian Beca apparently wrote:
>>> def dist():
>>>     d = zeros([N, C], dtype=float)
>>>     if N < C:
>>>         for i in range(N):
>>>             xy = A[i] - B
>>>             d[i,:] = sqrt(sum(xy**2, axis=1))
>>>         return d
>>>     else
Alan G Isaac wrote:
>On Sun, 18 Jun 2006, Sebastian Beca apparently wrote:
>
>
>>def dist():
>>    d = zeros([N, C], dtype=float)
>>    if N < C:
>>        for i in range(N):
>>            xy = A[i] - B
>>            d[i,:] = sqrt(sum(xy**2, axis=1))
>>        return d
>>    else:
>>        for j in range(C):
>>            xy = A - B[j]
>>            d[:,j] = sqrt(sum(xy**2, axi
On Sun, 18 Jun 2006, Sebastian Beca apparently wrote:
> def dist():
>     d = zeros([N, C], dtype=float)
>     if N < C:
>         for i in range(N):
>             xy = A[i] - B
>             d[i,:] = sqrt(sum(xy**2, axis=1))
>         return d
>     else:
>         for j in range(C):
>             xy = A - B[j]
>             d[:,j] = sqrt(sum(xy**2, axis=1))
>         return d
But that
I checked the Matlab version's code, and it does the same thing as discussed
here. The only thing to make sure of is that you loop over the
shorter dimension of the output array. Speed-wise, the Matlab code still
runs about twice as fast for large data sets (judging only by timing
both by hand).
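For reference, the "loop over the shorter dimension" idea could be sketched roughly as below. This is only an illustration, not the Matlab code or anyone's posted script: the name dist_loop_short and the test data are made up here, and it uses the explicit `np.` prefix rather than the star-imports seen in the thread.

```python
import numpy as np

def dist_loop_short(A, B):
    """Euclidean distances between the rows of A (n x k) and B (m x k).

    Hypothetical helper: loops over whichever input has fewer rows, so
    each iteration does one vectorized pass over the larger input.
    """
    n, m = A.shape[0], B.shape[0]
    d = np.empty((n, m), dtype=float)
    if n <= m:
        # A has fewer rows: fill the output one row at a time.
        for i in range(n):
            diff = A[i] - B                      # shape (m, k)
            d[i, :] = np.sqrt((diff ** 2).sum(axis=1))
    else:
        # B has fewer rows: fill the output one column at a time.
        for j in range(m):
            diff = A - B[j]                      # shape (n, k)
            d[:, j] = np.sqrt((diff ** 2).sum(axis=1))
    return d

# Assumed example sizes, similar to those mentioned in the thread.
rng = np.random.default_rng(0)
A = rng.random((4, 2))
B = rng.random((1000, 2))
D = dist_loop_short(A, B)
```

With 4 rows against 1000, the Python-level loop runs only 4 times and the rest of the work happens inside NumPy, which is presumably why the choice of loop dimension matters so much here.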
Alex Cannon wrote:
> How about this?
>
> def d5():
>     return add.outer(sum(A*A, axis=1), sum(B*B, axis=1)) - \
>         2.*dot(A, transpose(B))
You might lose some precision with that approach, so the OP should compare
results and timings to look at the tradeoffs.
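One way to look at that tradeoff is to compare the expanded form against a direct computation. The sketch below is an assumption-laden illustration, not part of the original exchange: the function names, the data, and the 1e6 offset (chosen to provoke cancellation) are all made up. Note that the expanded form, like d5 above, yields squared distances, so a square root is still needed to get Euclidean distances.

```python
import numpy as np

def sqdist_direct(A, B):
    # Squared distances from explicit differences (broadcast to n x m x k).
    diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]
    return (diff ** 2).sum(axis=2)

def sqdist_expanded(A, B):
    # Same quantity via |a|^2 + |b|^2 - 2 a.b, as in d5 above.
    return (np.add.outer((A * A).sum(axis=1), (B * B).sum(axis=1))
            - 2.0 * np.dot(A, B.T))

rng = np.random.default_rng(0)
# Hypothetical data: a large common offset makes the squared norms huge
# relative to the distances, which is where cancellation tends to show up.
A = rng.random((4, 3)) + 1e6
B = rng.random((1000, 3)) + 1e6

direct = sqdist_direct(A, B)
expanded = sqdist_expanded(A, B)
max_abs_err = np.abs(direct - expanded).max()
print("largest absolute disagreement:", max_abs_err)

# Rounding can push small squared distances slightly below zero,
# so clip before taking the square root.
dist = np.sqrt(np.maximum(expanded, 0.0))
```

For data without a large offset the two forms should agree closely, so whether the loss of precision matters depends on the dataset.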
--
Rob
How about this?
def d5():
    return add.outer(sum(A*A, axis=1), sum(B*B, axis=1)) - \
        2.*dot(A, transpose(B))
___
Numpy-discussion mailing list
Numpy-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/list
Hi,
> def d4():
>     d = zeros([4, 1000], dtype=float)
>     for i in range(4):
>         xy = A[i] - B
>         d[i] = sqrt( sum(xy**2, axis=1) )
>     return d
>
> Maybe there's another alternative to d4?
> Thanks again,
I think this is the fastest you can get. Maybe it would be nicer to use
Please replace:
C = 4
N = 1000
> d = zeros([C, N], dtype=float)
BK.
Thanks! Avoiding the inner loop is MUCH faster (~20-300 times faster than the
original). Nevertheless, I don't think I can use hypot, as it only works
for two dimensions. The general problem I have is:
A = random( [C, K] )
B = random( [N, K] )
C ~ 1-10
N ~ Large (thousands, millions.. i.e. my dataset)
K ~
Christopher Barker wrote:
>Bruce Southey wrote:
>
>
>>Please run the exact same code in Matlab that you are running in
>>NumPy. Many of Matlab's functions are very highly optimized so these are
>>provided as binary functions. I think that you are running into this
>>so you are not doing the correc
Bruce Southey wrote:
> Please run the exact same code in Matlab that you are running in
> NumPy. Many of Matlab's functions are very highly optimized so these are
> provided as binary functions. I think that you are running into this
> so you are not doing the correct comparison
He is doing the cor
Hi,
Please run the exact same code in Matlab that you are running in
NumPy. Many of Matlab's functions are very highly optimized so these are
provided as binary functions. I think that you are running into this
so you are not doing the correct comparison
So the ways around it are to write an extensi
Sebastian Beca wrote:
>Hi,
>I'm working with NumPy/SciPy on some algorithms and I've run into some
>important speed differences wrt Matlab 7. I've narrowed the main speed
>problem down to the operation of finding the euclidean distance
>between two matrices that share one dimension rank (dist in M
Hi,
I'm working with NumPy/SciPy on some algorithms and I've run into some
important speed differences wrt Matlab 7. I've narrowed the main speed
problem down to the operation of finding the euclidean distance
between two matrices that share one dimension rank (dist in Matlab):
Python:
def dtest()
Hi,
On Fri, Jun 16, 2006 at 08:28:18AM +0200, Johannes Loehnert wrote:
> Hi,
>
> def dtest():
>     A = random( [4,2])
>     B = random( [1000,2])
>
>     # drawback: memory usage temporarily doubled
>     # solution see below
>     d = A[:, newaxis, :] - B[newaxis, :, :]
Unless I'm wrong, one
Hi,
def dtest():
    A = random( [4,2])
    B = random( [1000,2])
    # drawback: memory usage temporarily doubled
    # solution see below
    d = A[:, newaxis, :] - B[newaxis, :, :]
    # written as 3 expressions for more clarity
    d = sqrt((d**2).sum(axis=2))
    return d
def dtest_lowmem(
Hi Sebastian,
I am not sure if there is a function already defined in numpy, but
something like this may be what you are after
def distance(a1, a2):
    return sqrt(sum((a1[:,newaxis,:] - a2[newaxis,:,:])**2, axis=2))
The general idea is to avoid loops if you want the code to execute
fast. I hop
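A self-contained variant of the loop-free approach, written with an explicit `np.` prefix, might look like the sketch below. The small input arrays are made up for illustration. One reason to prefer the prefix is that Python's builtin sum does not accept an axis argument, so the one-liner above only works when numpy's sum has been imported over it.

```python
import numpy as np

def distance(a1, a2):
    """Pairwise Euclidean distances between rows of a1 (n x k) and a2 (m x k)."""
    # Broadcasting builds an (n, m, k) array of coordinate differences.
    diff = a1[:, np.newaxis, :] - a2[np.newaxis, :, :]
    # Sum the squared differences over the coordinate axis, then take the root.
    return np.sqrt((diff ** 2).sum(axis=2))

# Hypothetical inputs with easy-to-check distances (3-4-5 triangle).
a1 = np.array([[0.0, 0.0], [3.0, 4.0]])
a2 = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
D = distance(a1, a2)
print(D)
```

The intermediate array has shape (n, m, k), so for very large inputs the memory cost of this version can become the limiting factor even though no Python loop is involved.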