GitHub user viirya commented on the issue:

    https://github.com/apache/spark/pull/17459
  
    > I considered having toBlockMatrix check if the rows of IndexedRowMatrix 
were dense or sparse, but there is no guarantee of consistency. Like, an 
IndexedRowMatrix could be a mix of Dense and Sparse Vectors. In that case, it 
would not be clear what type of BlockMatrix to create. A decent approximation 
of this would be to just decide the matrix type based on the first vector we 
look at in the iterator we get from groupByKey, creating a mix of Dense and 
Sparse matrices in a BlockMatrix, but I still think it's best to be explicit. 
Also, we currently have the description of toBlockMatrix promising to make a 
BlockMatrix backed by instances of SparseMatrix, so we have made promises to 
users about the composition of the BlockMatrix before.
    
    I don't mean that we don't care about it. I meant that there is no guarantee that 
a `BlockMatrix` consists purely of `DenseMatrix` or `SparseMatrix` blocks. It could 
be a mix of them.
    
    Thus, we can have a `toBlockMatrix` that creates a `BlockMatrix` containing 
a mix of `DenseMatrix` and `SparseMatrix` blocks. Each block in a `BlockMatrix` can be 
either a `DenseMatrix` or a `SparseMatrix`, depending on the ratio of non-zero values in the 
block. Yes, this is like the "decent approximation" you mentioned.
    
    For an `IndexedRowMatrix` consisting entirely of `DenseVector` rows, this 
`toBlockMatrix` definitely returns a `BlockMatrix` backed by `DenseMatrix`. In 
other cases, `DenseMatrix` might not be the best choice for every block in the 
`BlockMatrix`, as many blocks will be sparse.
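    
    As a rough illustration of the per-block choice described above, here is a 
minimal sketch in plain Python rather than Spark code. The names `nonzero_ratio` 
and `choose_block_format` and the 0.5 threshold are hypothetical and do not come 
from the pull request or from the Spark API; a block is simply modeled as a list 
of rows.

```python
from typing import List

# Hypothetical threshold: store a block sparsely when fewer than this
# fraction of its entries are non-zero. This value is an assumption for
# illustration, not something taken from Spark.
SPARSE_THRESHOLD = 0.5


def nonzero_ratio(block: List[List[float]]) -> float:
    """Return the fraction of entries in a row-major block that are non-zero."""
    total = sum(len(row) for row in block)
    if total == 0:
        return 0.0
    nonzeros = sum(1 for row in block for value in row if value != 0.0)
    return nonzeros / total


def choose_block_format(block: List[List[float]],
                        threshold: float = SPARSE_THRESHOLD) -> str:
    """Pick "sparse" or "dense" for one block, independently of other blocks."""
    return "sparse" if nonzero_ratio(block) < threshold else "dense"


# Two blocks of the same (hypothetical) block matrix.
blocks = [
    [[1.0, 2.0], [3.0, 4.0]],                             # every entry non-zero
    [[0.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 0.0]],  # one entry in nine
]

# Each block gets its own format, so the result can be a mix.
formats = [choose_block_format(block) for block in blocks]
print(formats)
```

    Under these assumptions, a matrix whose blocks are all fully populated ends 
up with only dense blocks, while a matrix with a few populated regions ends up 
with a mix of dense and sparse blocks.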
    
    As for the promise that `toBlockMatrix` makes a `BlockMatrix` backed by 
instances of `SparseMatrix`, as I said, it is not explicitly bound at the API 
level. I don't think it is a big problem.


