You can always define an RDD transpose function yourself. This is what I use in 
PySpark to transpose an RDD of numpy vectors. It's not optimal, and each vector 
(including each transposed row, which is a full column of the original data) 
needs to fit in memory on a worker node.

import numpy as np

def rddTranspose(rdd):
    # Index the rows and the columns, giving (row, column, value) triplets
    dataT1 = rdd.zipWithIndex().flatMap(
        lambda xi: [(xi[1], j, e) for (j, e) in enumerate(xi[0])])
    # Use the column index of the original as key, then group and sort by it
    dataT2 = dataT1.map(lambda t: (t[1], (t[0], t[2]))) \
                   .groupByKey().sortByKey()
    # Sort the entries inside each new row by their original row index
    dataT3 = dataT2.map(lambda kv: sorted(kv[1], key=lambda ie: ie[0]))
    # Remove the indices inside the rows
    dataT4 = dataT3.map(lambda row: [e for (_, e) in row])
    # Convert the rows to numpy arrays
    return dataT4.map(lambda row: np.asarray(row))
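
To check the logic without a cluster, here is a minimal local sketch of the 
same index, regroup and sort steps, with a plain Python list of lists standing 
in for the RDD. The name transpose_rows and the sample matrix are made up for 
illustration; this is not Spark code.

```python
def transpose_rows(rows):
    """Transpose a list of equal-length rows with the same steps as the
    RDD version above, but on local Python lists (hypothetical helper)."""
    # Build (row index, column index, value) triplets
    triplets = [(i, j, e)
                for i, row in enumerate(rows)
                for j, e in enumerate(row)]
    # Group the (row index, value) pairs by original column index
    grouped = {}
    for i, j, e in triplets:
        grouped.setdefault(j, []).append((i, e))
    # Walk the columns in order, sort each group by row index,
    # then drop the indices
    return [[e for _, e in sorted(grouped[j], key=lambda ie: ie[0])]
            for j in sorted(grouped)]


matrix = [[1, 2, 3],
          [4, 5, 6]]
transposed = transpose_rows(matrix)
print(transposed)
```

Every group built in the second step holds a whole column of the input, which 
is why each column has to fit in memory on one worker in the distributed case.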

Cheers,
Toni

On 12 Jan 2015 at 20:45:58, Alex Minnaar (aminn...@verticalscope.com) wrote:

That's not quite what I'm looking for.  Let me provide an example.  I have a 
RowMatrix A that is n x m and two local matrices b and c, where b is m x 1 and 
c is n x 1.  In my Spark job I wish to perform the following two computations:

A*b

and

A^T*c

I don't think this is possible without being able to transpose a RowMatrix.  Am 
I correct?

Thanks,
Alex

From: Reza Zadeh <r...@databricks.com>
Sent: Monday, January 12, 2015 1:58 PM
To: Alex Minnaar
Cc: u...@spark.incubator.apache.org
Subject: Re: RowMatrix multiplication
 
As you mentioned, you can perform A * b, where A is a RowMatrix and b is a 
local matrix.

From your email, I figure you want to compute b * A^T. To do this, you can 
compute C = A * b^T, whose result is the transpose of what you were looking 
for, i.e. C^T = b * A^T. To undo the transpose, you would have to transpose C 
yourself. Be careful though, because each row of the result might not fit in 
memory on a single machine, which is what RowMatrix requires. This danger is 
why we didn't provide a transpose operation in RowMatrix natively.
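
As a small sanity check of the identity C^T = b * A^T, here is a local sketch 
in plain Python, with lists of rows standing in for both the RowMatrix and the 
local matrix. The helper names matmul and transpose and the sample values are 
assumptions for illustration, not MLlib API.

```python
def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def transpose(x):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*x)]


A = [[1, 2],
     [3, 4],
     [5, 6]]            # n x m = 3 x 2, stands in for the RowMatrix
b = [[1, 0],
     [2, 1]]            # k x m = 2 x 2 local matrix (made-up values)

C = matmul(A, transpose(b))        # n x k, computed row by row of A
result = transpose(C)              # k x n, the manual transpose step
direct = matmul(b, transpose(A))   # b * A^T computed directly

print(result == direct)
```

Each row of the transposed result has n entries, one per row of A, which is 
the memory concern described above.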

To address this and more, there is an effort to provide more comprehensive 
linear algebra through block matrices, which will likely make it to 1.3:
https://issues.apache.org/jira/browse/SPARK-3434

Best,
Reza

On Mon, Jan 12, 2015 at 6:33 AM, Alex Minnaar <aminn...@verticalscope.com> 
wrote:
I have a RowMatrix on which I want to perform two multiplications.  The first 
is a right multiplication with a local matrix, which is fine.  But after that I 
also wish to right-multiply the transpose of my RowMatrix by a different 
local matrix.  I understand that there is no functionality to transpose a 
RowMatrix at this time, but I was wondering if anyone could suggest any kind 
of workaround for this.  I had thought that I might be able to initially 
create two RowMatrices - a normal version and a transposed version - and use 
either when appropriate.  Can anyone think of another alternative?

Thanks,
Alex

