Spark MLlib: Should I call .cache before fitting a model?

Gevorg Hari Tue, 27 Feb 2018 11:25:10 -0800

Imagine that I am training a Spark MLlib model as follows:

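import org.apache.spark.ml.classification.LogisticRegression

val trainingData = loadTrainingData(...)
val logisticRegression = new LogisticRegression()

// Cache the training set before fitting
trainingData.cache()
val logisticRegressionModel = logisticRegression.fit(trainingData)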

Does the call to trainingData.cache improve performance at training time,
or is it unnecessary?

Does the .fit(...) method of an ML algorithm call cache/unpersist
internally?
