viirya commented on a change in pull request #25789: [SPARK-28927][ML] Show warning when input data to ALS is indeterminate
URL: https://github.com/apache/spark/pull/25789#discussion_r324476149
##########
File path: R/pkg/R/mllib_recommendation.R
##########
@@ -82,6 +82,10 @@ setClass("ALSModel", representation(jobj = "jobj"))
 #' statsS <- summary(modelS)
 #' }
 #' @note spark.als since 2.1.0
+#' @note the input rating dataframe to the ALS implementation should not be indeterminate.

Review comment:
If randomSplit or sample is used to compute the input, the only way to make it deterministic is to checkpoint. But obviously we can't checkpoint on behalf of users. I think it is good to leave a clue so that users know what is going on when they hit the ArrayIndexOutOfBoundsException while fitting an ALS model. Since we don't want to break existing user code, a warning is the least we can do. I would also like to catch the exception and re-throw it with a meaningful message for users.
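To make the catch-and-re-throw idea concrete, here is a minimal sketch in plain Python. It is not Spark's ALS code and does not use any Spark API: `fit_toy_model`, `IndeterminateInputError`, `make_stable_reader` and `make_unstable_reader` are hypothetical names invented for illustration. The toy fit reads its input twice, as a stand-in for a computation that re-evaluates an uncheckpointed input, and an `IndexError` plays the role of the ArrayIndexOutOfBoundsException.

```python
class IndeterminateInputError(RuntimeError):
    """Hypothetical error type that carries a user-facing explanation."""


def fit_toy_model(read_ratings):
    """Toy two-pass fit over (user, item, rating) rows with non-negative user ids.

    Pass 1 sizes a per-user buffer; pass 2 re-reads the input and fills it.
    If the second read yields a user id the first read never saw, the raw
    IndexError is re-raised with a message that points at the likely cause.
    """
    # Pass 1: size the buffer from the largest user id seen.
    size = max(user for user, _item, _rating in read_ratings()) + 1
    totals = [0.0] * size

    # Pass 2: re-evaluate the input, as a recomputed dataset would be.
    try:
        for user, _item, rating in read_ratings():
            totals[user] += rating
    except IndexError as exc:
        raise IndeterminateInputError(
            "Input ratings changed between evaluations. If they come from an "
            "unseeded sample or split, materialize (checkpoint) them first."
        ) from exc
    return totals


ROWS = [(0, 10, 4.0), (1, 11, 3.0), (1, 12, 5.0)]


def make_stable_reader():
    # Analogue of checkpointed input: every read returns the same rows.
    return lambda: ROWS


def make_unstable_reader():
    # Analogue of indeterminate input: the second read returns an extra user.
    calls = {"n": 0}

    def read():
        calls["n"] += 1
        return ROWS if calls["n"] == 1 else ROWS + [(2, 13, 1.0)]

    return read
```

Calling `fit_toy_model(make_stable_reader())` completes normally, while `fit_toy_model(make_unstable_reader())` raises `IndeterminateInputError` with the original `IndexError` attached as its cause, so the low-level failure stays available for debugging.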