Github user thunterdb commented on the issue: https://github.com/apache/spark/pull/14299 @AnthonyTruchet thank you for the PR. This is definitely worth fixing for large deployments. Now, as you noticed, this portion of code does not quite abide by the best engineering practices... Instead of adding an extra layer of nesting, would you mind make the following changes? ```scala def fit[S <: Iterable[String]](dataset: RDD[S]): Word2VecModel = { ... val expTable = sc.broadcast(createExpTable()) val bcVocab = sc.broadcast(vocab) val bcVocabHash = sc.broadcast(vocabHash) try { fit0(expTable, ...) } finally { ... } } private final def fit0(...) { // Put all the content here. // Note that the inner code also includes some broadcasts, you may want to fix these as well if you can } ``` I personally agree about resource management and scala-arm. We try to keep scala dependencies to a minimum, unfortunately, because they can be very tedious to move from one scala version to another.
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org