date:20090316

Re: [Numpy-discussion] Overlapping ranges

2009-03-16 Thread josef . pktd

On Mon, Mar 16, 2009 at 5:29 PM, Robert Kern  wrote:
> 2009/3/16 Peter Saffrey :
>
>> At the moment, I'm using a fairly naive approach that finds roughly where
>> in the genome (which gene) each point might be and then checks it against
>> the bins in that gene. If I split the problem into chromosomes, I feel sure
>> there must be some super-fast matrix approach I can apply using numpy, but
>> I'm struggling a bit. Can anybody suggest something?
>
> You probably need something algorithmically better, like interval
> trees. There are a couple of C/Python implementations floating around.
>

If I understand your problem correctly, then for a smaller-scale
problem something like this should work:
{{{
import numpy as np

B = np.array([[1,3],[2,5],[7,10], [6,15],[14,20]]) # bins
P = np.c_[np.arange(1,16), 4+np.arange(1,16)]  # points

#mask = (~(P[:,0:1]>D[:,1:2].T)) * (~(P[:,1:2]B[:,1:2].T), (P[:,1:2]
}}}
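The archived copy of the snippet above is cut off mid-expression. What follows is a minimal sketch of the same broadcasting idea, rebuilt from the parts that are still readable. It is not the original code. It assumes closed intervals (both endpoints count as inside), and the names `misses`, `point_idx` and `bin_idx` are made up for illustration.

```python
import numpy as np

# Same toy data as above: each row is a closed interval [start, end].
B = np.array([[1, 3], [2, 5], [7, 10], [6, 15], [14, 20]])  # bins
P = np.c_[np.arange(1, 16), 4 + np.arange(1, 16)]            # points

# A point misses a bin only if it starts after the bin ends or ends
# before the bin starts. Broadcasting (n_points, 1) against (1, n_bins)
# gives an (n_points, n_bins) boolean matrix.
misses = (P[:, 0:1] > B[:, 1:2].T) | (P[:, 1:2] < B[:, 0:1].T)
mask = ~misses

# mask[i, j] is True when point i overlaps bin j.
point_idx, bin_idx = np.nonzero(mask)
```

The full mask has n_points * n_bins entries, so at the sizes mentioned in the original question it would have to be built in chunks (for example per chromosome and per block of points) rather than all at once.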

Re: [Numpy-discussion] Overlapping ranges

2009-03-16 Thread Robert Kern

2009/3/16 Peter Saffrey :

> At the moment, I'm using a fairly naive approach that finds roughly where
> in the genome (which gene) each point might be and then checks it against
> the bins in that gene. If I split the problem into chromosomes, I feel sure
> there must be some super-fast matrix approach I can apply using numpy, but
> I'm struggling a bit. Can anybody suggest something?

You probably need something algorithmically better, like interval
trees. There are a couple of C/Python implementations floating around.
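
As a lighter-weight illustration of why sorting helps, here is a sketch that is not an interval tree: it only counts how many bins each point overlaps, using sorted endpoints and `np.searchsorted`. Listing the bins themselves would still need something like the interval tree suggested above. The function name `count_overlaps` is made up for this example, and closed intervals with start <= end are assumed.

```python
import numpy as np

def count_overlaps(bins, points):
    """Count, for each point interval, how many bin intervals it overlaps.

    bins, points: arrays of shape (n, 2) holding closed intervals
    [start, end] with start <= end.
    """
    starts = np.sort(bins[:, 0])
    ends = np.sort(bins[:, 1])
    # Number of bins that begin at or before the point's end ...
    begun = np.searchsorted(starts, points[:, 1], side="right")
    # ... minus the bins that already ended before the point's start.
    finished = np.searchsorted(ends, points[:, 0], side="left")
    return begun - finished
```

Every bin that ended before the point starts also began before the point ends, so the difference is exactly the number of overlapping bins. The cost is two sorts plus two binary searches per point.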

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] Overlapping ranges

2009-03-16 Thread Peter Saffrey


I'm trying to file a set of data points, defined by genome coordinates, into 
bins, also based on genome coordinates. Each data point is (chromosome, start, 
end, point) and each bin is (chromosome, start, end). I have about 140 million 
points to file into around 100,000 bins. Both are (roughly) evenly distributed 
over the 24 chromosomes (1-22, X and Y). Genome coordinates are integers and my 
data points are floats. For each data point, (end - start) is roughly 1000, but 
the bins are of uneven widths. Bins might also overlap - in that case, 
I need to know all the bins that a point overlaps.

By overlap, I mean the start or end of the data point (or both) is inside the 
bin or that the point entirely covers the bin.
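
The overlap rule above can be written out case by case and compared with the usual single-comparison interval test. This is a sketch under the assumption that interval boundaries are inclusive and that start <= end for both the point and the bin. The function names are made up for illustration.

```python
def overlaps(p_start, p_end, b_start, b_end):
    """Overlap as described above, spelled out case by case."""
    start_inside = b_start <= p_start <= b_end
    end_inside = b_start <= p_end <= b_end
    covers_bin = p_start <= b_start and b_end <= p_end
    return start_inside or end_inside or covers_bin


def overlaps_short(p_start, p_end, b_start, b_end):
    """Equivalent single test, valid when start <= end for both intervals."""
    return p_start <= b_end and b_start <= p_end
```

The short form is the one that vectorises most easily, since it needs only two comparisons per (point, bin) pair.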

At the moment, I'm using a fairly naive approach that finds roughly where in 
the genome (which gene) each point might be and then checks it against the 
bins in that gene. If I split the problem into chromosomes, I feel sure there must 
be some super-fast matrix approach I can apply using numpy, but I'm struggling 
a bit. Can anybody suggest something?

Peter


[Numpy-discussion] 1.3.x branch created - trunk now opened for 1.4

2009-03-16 Thread David Cournapeau

Hi,

I have just started the 1.3.x branch - as such, any change made to the
trunk will not end up in the 1.3 release. I will announce the 1.3 beta
release within the day, hopefully.

cheers,

David

Re: [Numpy-discussion] svn and tickets email status

2009-03-16 Thread Charles R Harris

2009/3/16 Ryan May 

> Hi,
>
> What's the status on SVN and ticket email notifications?  The only messages
> I'm seeing since the switch are the occasional spam.  Should I try
> re-subscribing?
>

I get the ticket notifications, but I think the svn notifications are still
broken. I needed to update my email address to receive ticket notifications;
the mail was going to an old address after the change.

Chuck

Re: [Numpy-discussion] Superfluous array transpose (cf. ticket #1054)

2009-03-16 Thread Pearu Peterson

On Mon, March 16, 2009 4:05 pm, Sturla Molden wrote:
> On 3/16/2009 9:27 AM, Pearu Peterson wrote:
>
>> If an operation produces a new array, then the new array should have the
>> storage properties of the lhs operand.
>
> That would not be enough, as 1+a would behave differently from a+1. The
> former would change storage order and the latter would not.

Actually, 1+a would be handled by the __radd__ method and hence
the storage order would be defined by the rhs (lhs of the __radd__ method).
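
The dispatch rule referred to here can be seen with a small stand-in class. `Logged` is a hypothetical type written only for this illustration. It is not how ndarray is implemented.

```python
class Logged:
    """Hypothetical stand-in for an array type that records dispatch."""

    def __init__(self):
        self.calls = []

    def __add__(self, other):
        self.calls.append("__add__")
        return self

    def __radd__(self, other):
        self.calls.append("__radd__")
        return self


a = Logged()
a + 1  # a is the left operand, so Logged.__add__ handles it
1 + a  # int.__add__ returns NotImplemented, so Python calls a.__radd__(1)
```

In both expressions the method that runs belongs to the array-like operand, which is why a rule based on that operand's properties can treat 1+a and a+1 the same way.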

> Broadcasting arrays adds further to the complexity of the problem.

I guess, similar rules should be applied to storage order then.

Pearu



[Numpy-discussion] svn and tickets email status

2009-03-16 Thread Ryan May

Hi,

What's the status on SVN and ticket email notifications?  The only messages
I'm seeing since the switch are the occasional spam.  Should I try
re-subscribing?

Ryan

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
Sent from: Norman Oklahoma United States.

Re: [Numpy-discussion] Superfluous array transpose (cf. ticket #1054)

2009-03-16 Thread Sturla Molden

On 3/16/2009 9:27 AM, Pearu Peterson wrote:

> If an operation produces a new array, then the new array should have the
> storage properties of the lhs operand.

That would not be enough, as 1+a would behave differently from a+1. The 
former would change storage order and the latter would not.

Broadcasting arrays adds further to the complexity of the problem.

It seems necessary to do something like this to avoid the trap when using f2py:

def some_fortran_function(x):
    if x.flags['C_CONTIGUOUS']:
        shape = x.shape[::-1]
        _x = x.reshape(shape, order='F')
        _y = _f2py_wrapper(_x)
        shape = _y.shape[::-1]
        return _y.reshape(shape, order='C')
    else:
        return _f2py_wrapper(x)

And then preferably never use Fortran ordered arrays directly.
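
Below is a runnable variant of the wrapper idea above, offered only as a sketch. Two things are assumptions. First, `_f2py_wrapper` is replaced by a hypothetical stand-in that doubles its input, since the real routine is not shown. Second, the zero-copy reinterpretation is done with `.T` rather than `reshape(..., order='F')`, because `reshape` with `order='F'` also reads the source elements in Fortran index order and so does not simply relabel the existing buffer.

```python
import numpy as np

def _f2py_wrapper(x):
    # Hypothetical stand-in for an f2py-generated routine: it insists on
    # Fortran-ordered input and returns a Fortran-ordered result.
    assert x.flags['F_CONTIGUOUS']
    return np.asfortranarray(x * 2.0)

def some_fortran_function(x):
    if x.flags['C_CONTIGUOUS']:
        # x.T is a zero-copy, Fortran-contiguous view of the same buffer.
        _y = _f2py_wrapper(x.T)
        # Transposing back gives a C-contiguous result in x's orientation.
        return _y.T
    return _f2py_wrapper(x)
```

With this arrangement the caller only ever sees C-ordered arrays, and the Fortran side never triggers a hidden copy of its input.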

Sturla Molden


Re: [Numpy-discussion] Superfluous array transpose (cf. ticket #1054)

2009-03-16 Thread Pearu Peterson

On Sun, March 15, 2009 8:57 pm, Sturla Molden wrote:
>
> Regarding ticket #1054. What is the reason for this strange behaviour?
>
> >>> a = np.zeros((10,10),order='F')
> >>> a.flags
>   C_CONTIGUOUS : False
>   F_CONTIGUOUS : True
>   OWNDATA : True
>   WRITEABLE : True
>   ALIGNED : True
>   UPDATEIFCOPY : False
> >>> (a+1).flags
>   C_CONTIGUOUS : True
>   F_CONTIGUOUS : False
>   OWNDATA : True
>   WRITEABLE : True
>   ALIGNED : True
>   UPDATEIFCOPY : False

I wonder if this behavior could be considered as a bug
because it does not seem to have any advantages but
only hides the storage order change and that may introduce
inefficiencies.

If an operation produces a new array, then the new array should have the
storage properties of the lhs operand.
That would allow writing code

  a = zeros(, order='F')
  b = a + 1

instead of

  a = zeros(, order='F')
  b = a[:]
  b += 1

to keep the storage properties in operations.
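
A minimal sketch of the explicit workaround follows. The shape (10, 10) is borrowed from the quoted session as an assumption. Note that `a[:]` is a view of `a`, so an in-place update through it would change `a` as well. The sketch copies with an explicit order instead.

```python
import numpy as np

a = np.zeros((10, 10), order='F')

# Copy with an explicit order so that b owns its data and keeps
# the Fortran layout of a.
b = a.copy(order='F')
b += 1  # the in-place update writes into b's existing memory layout
```

After this, `b.flags` reports F_CONTIGUOUS as True and `a` is left untouched.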

Regards,
Pearu


Re: [Numpy-discussion] inplace dot products

2009-03-16 Thread David Warde-Farley

On 20-Feb-09, at 6:41 AM, Olivier Grisel wrote:

> Alright, thanks for the reply.
>
> Is there a canonical way / sample code to gain low-level access to
> blas / lapack / atlas routines using ctypes from numpy / scipy code?
>
> I don't mind fixing the dimensions and the ndtype of my array if it can
> decrease the memory overhead.

I got some clarification from Pearu Peterson off-list.

For gemm, the issue is that if the matrix C is not Fortran-ordered, it
will be copied, and that copy will be overwritten. Passing order='F' when
creating the array to be overwritten fixes this.
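
The copy-then-overwrite effect can be mimicked with plain NumPy. This sketch does not call gemm. It uses `np.asfortranarray` as a stand-in for the conversion a wrapper might perform, and whether the wrapper uses exactly this mechanism is an assumption.

```python
import numpy as np

c_c = np.zeros((3, 3), order='C')
c_f = np.zeros((3, 3), order='F')

# Converting to Fortran order copies a C-ordered 2-D array ...
tmp = np.asfortranarray(c_c)
tmp[0, 0] = 1.0  # lands in the temporary, not in c_c

# ... but leaves an already Fortran-ordered array as is.
same = np.asfortranarray(c_f)
same[0, 0] = 1.0  # lands in c_f itself
```

Only the array created with order='F' sees the write, which matches the description above of the overwritten result ending up in a copy.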

DWF
