cv.fit returns a CrossValidatorModel. If you want to extract the actual
model that was built, you need to do:

    val cvModel = cv.fit(data)

    val plmodel = cvModel.bestModel.asInstanceOf[PipelineModel]

    val model = plmodel.stages(2).asInstanceOf[whatever_model]

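Then you can call model.save on it. Note that stages(2) assumes the
classifier is the third stage of the pipeline (as in the three-stage example
quoted below), and whatever_model stands for the concrete model class of
that stage.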
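For what it's worth, here is a rough end-to-end sketch of the same idea. It
is only an illustration under several assumptions: Spark 2.x (where
RandomForestClassificationModel is writable), a made-up two-column dataset, a
two-stage pipeline (VectorAssembler + RandomForestClassifier) rather than
your actual one, and a hypothetical object name BestModelSketch. It also
picks the stage by type instead of by a hard-coded index, which is my own
variation and not required.

```scala
import org.apache.spark.ml.{Pipeline, PipelineModel}
import org.apache.spark.ml.classification.{RandomForestClassificationModel, RandomForestClassifier}
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
import org.apache.spark.sql.SparkSession

// Hypothetical helper object; names and data are illustrative only.
object BestModelSketch {

  // Fits a cross-validated pipeline, saves the best random forest to `path`,
  // and returns the model as read back from disk.
  def run(spark: SparkSession, path: String): RandomForestClassificationModel = {
    import spark.implicits._

    // Tiny made-up dataset: x1 separates the two classes, x2 is noise.
    val data = Seq.tabulate(40) { i =>
      val label = (i % 2).toDouble
      (label * 10.0 + (i % 5), (i % 3).toDouble, label)
    }.toDF("x1", "x2", "label")

    val assembler = new VectorAssembler()
      .setInputCols(Array("x1", "x2"))
      .setOutputCol("features")
    val rf = new RandomForestClassifier().setNumTrees(5).setSeed(42L)
    val pipeline = new Pipeline().setStages(Array(assembler, rf))

    val paramGrid = new ParamGridBuilder()
      .addGrid(rf.maxDepth, Array(2, 3))
      .build()

    val cv = new CrossValidator()
      .setEstimator(pipeline)
      .setEvaluator(new BinaryClassificationEvaluator)
      .setEstimatorParamMaps(paramGrid)
      .setNumFolds(2)
      .setSeed(42L)

    // cv.fit returns a CrossValidatorModel; bestModel is the fitted pipeline.
    val cvModel = cv.fit(data)
    val plModel = cvModel.bestModel.asInstanceOf[PipelineModel]

    // Pick the fitted classifier by type rather than by stage index.
    val rfModel = plModel.stages
      .collectFirst { case m: RandomForestClassificationModel => m }
      .getOrElse(sys.error("no RandomForestClassificationModel stage found"))

    // Save only the extracted model, then load it back.
    rfModel.write.overwrite().save(path)
    RandomForestClassificationModel.load(path)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("BestModelSketch")
      .getOrCreate()
    // The output path is an assumption; pass your own as the first argument.
    val path = args.headOption.getOrElse("/tmp/best-rf-model")
    val reloaded = run(spark, path)
    println(s"Reloaded model with ${reloaded.trees.length} trees")
    spark.stop()
  }
}
```

Saving only the extracted stage sidesteps the question of whether every
stage in the pipeline is writable, at the cost of having to re-apply the
feature stages yourself when you load the model later.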

On 19 January 2017 at 11:31, Minudika Malshan <minudika...@gmail.com> wrote:

> Hi,
>
> Thanks Rezaul and Asher Krim.
>
> The method suggested by Rezaul works fine for NaiveBayes but still fails
> for RandomForest and Multi-layer perceptron classifier.
> Everything is saved properly up to this stage.
>
> CrossValidator cv = new CrossValidator()
>         .setEstimator(pipeline)
>         .setEvaluator(evaluator)
>         .setEstimatorParamMaps(paramGrid)
>         .setNumFolds(folds);
>
> Any idea on how to resolve this?
>
>
>
>
>
> On Thu, Jan 12, 2017 at 9:13 PM, Asher Krim <ak...@hubspot.com> wrote:
>
>> What version of Spark are you on?
>> Although it's cut off, I think your error is with RandomForestClassifier,
>> is that correct? If so, you should upgrade to Spark 2, since I think this
>> class only became writable/readable in Spark 2 (
>> https://github.com/apache/spark/pull/12118)
>>
>> On Thu, Jan 12, 2017 at 8:43 AM, Md. Rezaul Karim <
>> rezaul.ka...@insight-centre.org> wrote:
>>
>>> Hi Malshan,
>>>
>>> The error says that one (or more) of the estimators/stages is not
>>> writable, i.e. it does not support the overwrite/model write operation.
>>>
>>> Suppose you want to configure an ML pipeline consisting of three stages:
>>> tokenizer, hashingTF, and nb:
>>>     val nb = new NaiveBayes().setSmoothing(0.00001)
>>>     val tokenizer = new Tokenizer().setInputCol("label").setOutputCol("label")
>>>     val hashingTF = new HashingTF().setInputCol(tokenizer.getOutputCol).setOutputCol("features")
>>>     val pipeline = new Pipeline().setStages(Array(tokenizer, hashingTF, nb))
>>>
>>>
>>> Now check whether all the stages are writable. To make this easier, try
>>> saving the stages individually, e.g.:
>>>     tokenizer.write.save("path")
>>>     hashingTF.write.save("path")
>>>
>>> After that, suppose you want to perform a 10-fold cross-validation as
>>> follows:
>>>     val cv = new CrossValidator()
>>>               .setEstimator(pipeline)
>>>               .setEvaluator(new BinaryClassificationEvaluator)
>>>               .setEstimatorParamMaps(paramGrid)
>>>               .setNumFolds(10)
>>>
>>> Where:
>>>     val paramGrid = new ParamGridBuilder()
>>>                             .addGrid(hashingTF.numFeatures, Array(10, 100, 1000))
>>>                             .addGrid(nb.smoothing, Array(0.001, 0.0001))
>>>                             .build()
>>>
>>> Now the model that you trained using the training set should be writable
>>> if all of the stages are okay:
>>>     val model = cv.fit(trainingData)
>>>     model.write.overwrite().save("output/NBModel")
>>>
>>>
>>>
>>> Hope that helps.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Regards,
>>> _________________________________
>>> *Md. Rezaul Karim*, BSc, MSc
>>> PhD Researcher, INSIGHT Centre for Data Analytics
>>> National University of Ireland, Galway
>>> IDA Business Park, Dangan, Galway, Ireland
>>> Web: http://www.reza-analytics.eu/index.html
>>> <http://139.59.184.114/index.html>
>>>
>>> On 12 January 2017 at 09:09, Minudika Malshan <minudika...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> When I try to save a pipeline model using Spark ML (Java), the
>>>> following exception is thrown.
>>>>
>>>>
>>>> java.lang.UnsupportedOperationException: Pipeline write will fail on
>>>> this Pipeline because it contains a stage which does not implement
>>>> Writable. Non-Writable stage: rfc_98f8c9e0bd04 of type class
>>>> org.apache.spark.ml.classification.Rand
>>>>
>>>>
>>>> Here is my code segment.
>>>>
>>>>
>>>> model.write().overwrite().save("mypath");
>>>>
>>>>
>>>> How to resolve this?
>>>>
>>>> Thanks and regards!
>>>>
>>>> Minudika
>>>>
>>>>
>>>
>>
>>
>> --
>> Asher Krim
>> Senior Software Engineer
>>
>
>
>
> --
> *Minudika Malshan*
> Undergraduate
> Department of Computer Science and Engineering
> University of Moratuwa
> Sri Lanka.
> <https://lk.linkedin.com/pub/minudika-malshan/100/656/a80>
>
>
>
