Maybe I'm missing something obvious, but couldn't you easily write your own 
'cross' function that uses a couple of nested for-loops to do the arithmetic, 
accumulating directly into the result without any intermediate allocations 
at all?
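
Something along these lines, as a rough sketch only. I'm assuming here that 
cross(X, w, Y) means X' * diag(w) * Y, with X being n-by-p, Y being n-by-q and 
w a length-n weight vector; the thread doesn't spell that out, so adjust as 
needed. The name weighted_cross is made up for illustration, and I've written 
it in plain Python with lists of rows just to show the loop structure:

```python
def weighted_cross(X, w, Y):
    """Sketch of X' * diag(w) * Y (an assumed meaning of cross(X, w, Y)).

    X is a non-empty list of n rows of length p, Y a list of n rows of
    length q, and w a list of n weights. Only the p-by-q result is
    allocated; X and Y are read but never copied or modified.
    """
    n = len(w)
    p = len(X[0])
    q = len(Y[0])
    out = [[0.0] * q for _ in range(p)]
    for i in range(n):            # one pass over the rows of X and Y
        wi = w[i]
        xi = X[i]
        yi = Y[i]
        for a in range(p):
            s = wi * xi[a]        # weight applied on the fly, no scaled copy of X
            row = out[a]
            for b in range(q):
                row[b] += s * yi[b]
    return out


result = weighted_cross([[1, 2], [3, 4]], [1, 2], [[1], [1]])
```

The only extra memory is the p-by-q output, which is small when n is much 
larger than p and q. In a compiled language you would want to order the loops 
to match the memory layout of your arrays.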

On Tuesday, July 7, 2015 at 6:24:34 PM UTC-4, Matthieu wrote:
>
> Thanks, this is what I currently do :)
>
> However, I'd like to find a solution that is both memory efficient (X can 
> be very large) and which does not modify X in place.
>
> Basically, I'm wondering whether there is a BLAS subroutine that would 
> compute cross(X, w, Y) in one pass without creating an 
> intermediate matrix as large as X or Y.
>
>
