GitHub user frreiss commented on the pull request:

    https://github.com/apache/spark/pull/8563#issuecomment-143096254

Review comments:

--> Try making the portion of the examples with input data more condensed, perhaps by reading the matrices from string constants.
--> There's an orphaned/duplicate ScalaDoc comment at line 356 of BlockMatrix.scala.
--> Remove the carriage returns on lines 370, 481, 569, and 732.
--> Recommend adding a call to the new subtract method to the MLlib programming guide.
--> New API calls on BlockMatrix should have corresponding PySpark APIs.
--> The error message at line 394 should print out the block sizes that don't match.
--> The code at line 384 should multiply every element of b.head by -1, as far as I can see.
--> Lines 456 and 465-471 have wrong indentation.
--> The Scaladoc at line 474 should state that blockRowRange and blockColRange are block indexes, not row/column indexes.
--> In lines 460-463, consider making a single pass over the blocks instead of four passes.
--> Add a note to SchurComplement that the current implementation assumes a11 fits into memory on the driver.
--> Might want to use a case class in the return type of blockLUtoSolver.
--> Take a close look at the performance impact of the chains of multiplications at line 811 when there are many levels of recursion.
--> In recursiveSequencesBuild, you may want to break off more than one block from the upper-left corner of the matrix; in many cases the available memory on the driver can comfortably hold, say, 10x10 blocks. You should be able to query the SparkContext's memory information to determine how much heap space is available for the in-memory portion of the computation. On a side note, it would be good to produce a user-friendly error if it looks like there is not enough local heap to process one block's data locally.
--> Might want to reuse the buffers for the local data allocated at lines 623-629, to avoid triggering a global GC at each level of recursion.
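To make the line-384 remark concrete, here is a minimal sketch of negating every element of a block before subtraction. The flat `Array[Double]` representation and the `negateBlock` name are illustrative only, not the actual `BlockMatrix` internals from the PR:

```scala
object NegateBlockSketch {
  // A block stored as a flat (column-major) array of doubles, similar in
  // spirit to MLlib's DenseMatrix values. Negating is a single map over the
  // values; the block's row/column indices are untouched.
  def negateBlock(values: Array[Double]): Array[Double] =
    values.map(v => -1.0 * v)
}
```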
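For the single-pass suggestion at lines 460-463, the idea is to bucket the blocks into the four quadrants of a 2x2 block partition in one traversal instead of filtering the collection four times. The `Block` alias and `splitOnce` helper below are hypothetical stand-ins for the PR's block representation:

```scala
object QuadrantSplitSketch {
  // ((blockRow, blockCol), values) -- a simplified stand-in for MLlib's
  // ((Int, Int), Matrix) block tuples.
  type Block = ((Int, Int), Array[Double])

  // Partition into (upper-left, upper-right, lower-left, lower-right) in a
  // single pass over `blocks`.
  def splitOnce(blocks: Seq[Block], splitRow: Int, splitCol: Int)
      : (Seq[Block], Seq[Block], Seq[Block], Seq[Block]) = {
    val buckets = Array.fill(4)(Seq.newBuilder[Block])
    blocks.foreach { case b @ ((i, j), _) =>
      val idx = (if (i >= splitRow) 2 else 0) + (if (j >= splitCol) 1 else 0)
      buckets(idx) += b
    }
    (buckets(0).result(), buckets(1).result(),
     buckets(2).result(), buckets(3).result())
  }
}
```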
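On the SchurComplement note: the reason a11 must fit on the driver is that the Schur complement of a11 in the 2x2 block partition [[a11, a12], [a21, a22]] is s = a22 - a21 * inv(a11) * a12, so inv(a11) is computed locally. A scalar-block sketch of the formula (names here are illustrative):

```scala
object SchurSketch {
  // Schur complement of a11 for scalar "blocks":
  //   s = a22 - a21 * a11^{-1} * a12
  // In the PR, a11 is a full block that must be inverted locally on the
  // driver, which is why it has to fit in driver memory.
  def schurScalar(a11: Double, a12: Double, a21: Double, a22: Double): Double =
    a22 - a21 * (1.0 / a11) * a12
}
```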
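For the blockLUtoSolver return type, a small case class gives the factors names at call sites instead of positional tuple access. The class and field names below are hypothetical, not taken from the PR:

```scala
// Hypothetical named return type for blockLUtoSolver. M would be
// BlockMatrix in the actual PR; a type parameter keeps this sketch
// self-contained.
case class BlockLUFactors[M](lower: M, upper: M, permutation: Seq[Int])

object BlockLUFactorsSketch {
  // Call sites can then read f.lower / f.upper instead of f._1 / f._2.
  def describe[M](f: BlockLUFactors[M]): String =
    s"L=${f.lower}, U=${f.upper}, P=${f.permutation.mkString(",")}"
}
```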