You may also take a look at PredictionIO, which can persist and then deploy MLlib models as web services.
Simon

On Sunday, March 8, 2015, Sean Owen <so...@cloudera.com> wrote:

> You don't need SparkContext to simply serialize and deserialize objects. It
> is a Java mechanism.
> On Mar 8, 2015 10:29 AM, "Xi Shen" <davidshe...@gmail.com> wrote:
>
>> Errr... do you have any suggestions for me before the 1.3 release?
>>
>> I can't believe there's no ML model serialize method in Spark. I think
>> training the models is quite expensive, isn't it?
>>
>> Thanks,
>> David
>>
>> On Sun, Mar 8, 2015 at 5:14 AM Burak Yavuz <brk...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> There is model import/export for some of the ML algorithms on the
>>> current master (and they'll be shipped with the 1.3 release).
>>>
>>> Burak
>>> On Mar 7, 2015 4:17 AM, "Xi Shen" <davidshe...@gmail.com> wrote:
>>>
>>>> Wait... it seems SparkContext does not provide a way to save/load object
>>>> files. It can only save/load RDDs. What did I miss here?
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On Sat, Mar 7, 2015 at 11:05 PM Xi Shen <davidshe...@gmail.com> wrote:
>>>>
>>>>> Ah, it is serializable. Thanks!
>>>>>
>>>>> On Sat, Mar 7, 2015 at 10:59 PM Ekrem Aksoy <ekremak...@gmail.com> wrote:
>>>>>
>>>>>> You can serialize your trained model to persist it somewhere.
>>>>>>
>>>>>> Ekrem Aksoy
>>>>>>
>>>>>> On Sat, Mar 7, 2015 at 12:10 PM, Xi Shen <davidshe...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I checked a few ML algorithms in MLlib.
>>>>>>>
>>>>>>> https://spark.apache.org/docs/0.8.1/api/mllib/index.html#org.apache.spark.mllib.classification.LogisticRegressionModel
>>>>>>>
>>>>>>> I could not find a way to save the trained model. Does this mean I
>>>>>>> have to train my model every time? Is there a more economical way to do
>>>>>>> this?
>>>>>>>
>>>>>>> I am thinking about something like:
>>>>>>>
>>>>>>> model.run(...)
>>>>>>> model.save("hdfs://path/to/hdfs")
>>>>>>>
>>>>>>> Then, next I can do:
>>>>>>>
>>>>>>> val model = Model.createFrom("hdfs://...")
>>>>>>> model.predict(vector)
>>>>>>>
>>>>>>> I am new to Spark, maybe there are other ways to persist the
>>>>>>> model?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
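
For the pre-1.3 route Sean and Ekrem describe (plain Java serialization, no SparkContext involved), a minimal sketch might look like the following. `LinearModel` is a hypothetical stand-in class, not an MLlib type, and `ModelPersistence`, `save`, and `load` are names made up for illustration. The same two helpers should work for any model object that implements `java.io.Serializable`, which the thread says the trained model does. The sketch writes to a local file; writing to HDFS would mean swapping in a stream from the Hadoop FileSystem API, which is not shown here.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class ModelPersistence {

    // Hypothetical stand-in for a trained model (assumption, not an MLlib class).
    static final class LinearModel implements Serializable {
        private static final long serialVersionUID = 1L;
        private final double[] weights;
        private final double intercept;

        LinearModel(double[] weights, double intercept) {
            this.weights = weights.clone();
            this.intercept = intercept;
        }

        // Returns 1.0 when the linear margin is positive, otherwise 0.0.
        double predict(double[] features) {
            double margin = intercept;
            for (int i = 0; i < weights.length; i++) {
                margin += weights[i] * features[i];
            }
            return margin > 0 ? 1.0 : 0.0;
        }
    }

    // Writes any Serializable object to a local file using standard Java serialization.
    static void save(Serializable model, String path) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(path))) {
            out.writeObject(model);
        }
    }

    // Reads the object back; the caller casts it to the expected model type.
    static Object load(String path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(path))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // "Train" once (here: hard-coded weights), save, then reload and predict.
        LinearModel trained = new LinearModel(new double[] {0.5, -1.0}, 0.25);
        File file = File.createTempFile("model", ".bin");
        file.deleteOnExit();
        save(trained, file.getPath());
        LinearModel restored = (LinearModel) load(file.getPath());
        System.out.println(restored.predict(new double[] {2.0, 1.0}));
    }
}
```

Java serialization ties the saved bytes to the class definition, so a file written against one library version may fail to load after an upgrade. That may be a reason to switch to the built-in import/export Burak mentions once 1.3 is available.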