Currently, fit for many (most, I think) models will cache the input data internally. For LogisticRegression this is definitely the case, so you won't get any benefit from caching it yourself.
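For reference, the relevant logic inside LogisticRegression's training code looks roughly like the sketch below. This is a paraphrase of the Spark 2.x source, not runnable on its own (it needs a SparkSession and the real feature-extraction code); the method name trainSketch and the placeholder body are mine.

```scala
import org.apache.spark.sql.Dataset
import org.apache.spark.storage.StorageLevel

// Paraphrased sketch of LogisticRegression.train in Spark 2.x: if the caller
// has not already persisted the input, Spark persists the extracted training
// instances itself and unpersists them once training finishes.
def trainSketch(dataset: Dataset[_]): Unit = {
  // true only when the input Dataset is NOT already cached by the caller
  val handlePersistence = dataset.storageLevel == StorageLevel.NONE

  // in the real code this extracts (label, weight, features) instances
  val instances = dataset.rdd
  if (handlePersistence) instances.persist(StorageLevel.MEMORY_AND_DISK)

  // ... iterative optimization (L-BFGS / OWL-QN) runs over `instances` ...

  if (handlePersistence) instances.unpersist()
}
```

So whether you cache the input yourself or let fit do it, the data used by the iterative optimizer ends up persisted either way, which is why an explicit trainingData.cache brings no extra speedup here.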
On Tue, 27 Feb 2018 at 21:25 Gevorg Hari <gevorgh...@gmail.com> wrote:
> Imagine that I am training a Spark MLlib model as follows:
>
>     val trainingData = loadTrainingData(...)
>     val logisticRegression = new LogisticRegression()
>
>     trainingData.cache
>     val logisticRegressionModel = logisticRegression.fit(trainingData)
>
> Does the call trainingData.cache improve performance at training time, or
> is it not needed?
>
> Does the .fit(...) method for an ML algorithm call cache/unpersist
> internally?